AN ESSENTIALLY COMPLETE CLASS OF DECISION FUNCTIONS
FOR CERTAIN STANDARD SEQUENTIAL PROBLEMS


By Milton Sobel


Columbia University


Summary. A sequential problem is considered in which independent observations are taken on a chance variable $X$ whose distribution can be represented by

(1)  $dG_\theta(x) = \psi(\theta) e^{\theta x}\, d\mu(x)$,

where the parameter $\theta$ belongs to a given interval of the real line but is otherwise unknown. The problem is to test $H_1: \theta \le \theta^*$ against $H_2: \theta > \theta^*$, where $\theta^*$ is a given point in $\Omega$. Under certain assumptions the following class $A$ is shown to be essentially complete relative to the class of decision rules with bounded risk functions. The decision rule $\delta \in A$ if and only if after taking $n$ observations
(i) $\delta$ depends on the observations only through $n$ and $v_n = \sum_{i=1}^{n} x_i$, and
(ii) $\delta$ specifies a closed interval $J_n: [a_{1n}, a_{2n}]$ for each $n$ and the following rule of action:
(a) Stop experimentation as soon as $v_n \notin J_n$ and
(1) accept $H_1$ if $v_n < a_{1n}$,
(2) accept $H_2$ if $v_n > a_{2n}$.
(b) If $a_{1n} < a_{2n}$, take another observation if $a_{1n} < v_n < a_{2n}$.
(c) If $a_{1n} \le a_{2n}$ and $v_n = a_{in}$, accept $H_i$ or take another observation or randomize between these two ($i = 1, 2$).
The Koopman-Darmois family of probability laws given above contains discrete members such as the binomial and Poisson distributions as well as absolutely continuous members such as the normal and exponential. It is interesting to note that the members of the class $A$ can be obtained by starting with the sequential probability ratio test for testing some point $\theta_1 < \theta^*$ against another point $\theta_2 > \theta^*$, namely, continue as long as

$B < \frac{\prod_{i=1}^{n} \psi(\theta_2) e^{\theta_2 x_i}}{\prod_{i=1}^{n} \psi(\theta_1) e^{\theta_1 x_i}} < A$,

and replacing the constants $B$, $A$ by two arbitrary sequences $B_n$, $A_n$ such that $B_n \le A_n$ ($n = 1, 2, \cdots$).
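The passage from the probability ratio test to an interval for $v_n$ can be made concrete. Taking logarithms, the continuation region is $\log B < n[\log\psi(\theta_2) - \log\psi(\theta_1)] + (\theta_2 - \theta_1) v_n < \log A$, which is an open interval in $v_n$. The following is a minimal sketch of that computation; the function names and the choice of the unit-variance normal case (with $\mu$ the standard normal measure, so that $\psi(\theta) = e^{-\theta^2/2}$) are illustrative assumptions and do not appear in the paper.

```python
import math


def sprt_continuation_interval(n, theta1, theta2, log_psi, log_b, log_a):
    """Return (lo, hi) such that the ratio test continues iff lo < v_n < hi.

    The log likelihood ratio after n observations with sum v_n equals
    n * (log psi(theta2) - log psi(theta1)) + (theta2 - theta1) * v_n.
    """
    if not theta1 < theta2:
        raise ValueError("need theta1 < theta2")
    drift = n * (log_psi(theta2) - log_psi(theta1))
    width = theta2 - theta1
    return (log_b - drift) / width, (log_a - drift) / width


def normal_log_psi(theta):
    # Assumed illustrative case: normal law with unit variance,
    # mu = standard normal measure, hence psi(theta) = exp(-theta^2 / 2).
    return -0.5 * theta * theta


lo, hi = sprt_continuation_interval(
    4, -0.5, 0.5, normal_log_psi, math.log(0.1), math.log(10.0)
)
```

Replacing the fixed pair `log_b`, `log_a` by values that change with `n` gives the arbitrary sequences $B_n \le A_n$ mentioned above.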


1. Statement of the problem. Independent observations are taken on a chance variable whose distribution is given by

(2)  $dG_\theta(x) = \psi(\theta) e^{\theta x}\, d\mu(x)$

where the parameter $\theta$ belongs to a given subset $\Omega$ of the real line but is otherwise unknown, and hence

(3)  $\psi(\theta) = \left[ \int_{-\infty}^{\infty} e^{\theta x}\, d\mu(x) \right]^{-1} > 0$ for all $\theta \in \Omega$.


The measure $\mu(x)$ is given to be absolutely continuous or discrete. The parameter space $\Omega$ is given to be an interval $[\underline{\theta}, \bar{\theta}]$ which may be finite or infinite, open or half-open or closed. It will be shown that a certain class $A$ of decision rules is an essentially complete class for testing the hypothesis

$H_1: \theta \le \theta^*$ against $H_2: \theta > \theta^*$

where $\theta^*$ is a given constant in $\Omega$. There is also given an indifference zone $Z$ in the form of an open interval $(\theta_1, \theta_2)$ with

(4)  $\underline{\theta} < \theta_1 \le \theta^* \le \theta_2 < \bar{\theta}$,

that is, if the true $\theta \in Z$ then the loss incurred by accepting either hypothesis is negligible and can be set equal to zero.

The results of this paper hold also for the case in which (2) is replaced by $\phi(\theta) \exp\{r(\theta) t(y)\}\, d\nu(y)$ where $r(\theta)$ is strictly monotonic in $\theta$ and $t(Y)$ is an absolutely continuous or discrete chance variable. Letting $x = t(y)$ reduces this form to $\phi(\theta) \exp\{x r(\theta)\}\, d\mu(x)$ where $\mu(x)$ is absolutely continuous or discrete. All the proofs below depend only on the strict monotonicity of $\theta$ and therefore hold if $\theta$ is replaced by $r(\theta)$ throughout.

If $\Omega$ is not an interval we can define a strictly monotonic function $\theta(\tau)$ from an interval $\Omega^*$ onto $\Omega$. Then considering $\tau$ as the unknown parameter in $\Omega^*$ and using the last remark in the previous paragraph the results will still hold.

ASSUMPTIONS. 

1. Let $W(\theta, i)$ ($i = 1, 2$) denote the loss incurred by accepting $H_i$ when $\theta$ is the true value. It is assumed that

(5)  $W(\theta, 1) = 0$ for $\theta < \theta_2$;  $W(\theta, 2) = 0$ for $\theta > \theta_1$,
     $W(\theta, 1) > 0$ for $\theta > \theta_2$;  $W(\theta, 2) > 0$ for $\theta < \theta_1$

and that the two weight or loss functions $W(\theta, i)$ ($i = 1, 2$) are bounded functions of $\theta$ on $\Omega$. It is also assumed that $W(\theta, 1)$ is a nondecreasing and $W(\theta, 2)$ a nonincreasing function of $\theta$ on $\Omega$.
2. Let $C(n)$ denote the cost of taking $n$ observations on $X$. It is assumed that

(6)  $C(n) = c_1 + c_2 + \cdots + c_n$  [$C(0) = 0$]

where $c_n$, the cost of taking the $n$th observation, is a positive constant which may vary with $n$. It is also assumed that for some positive constant $K$

(7)  $\liminf_{n \to \infty} n c_n \ge K$.


Let $V = \sum_{i=1}^{n} X_i$, $\bar{X} = V/n$ and let $x_i$ denote the observed value of $X_i$. Let $S_{\bar{x}}$ denote the smallest interval containing all possible values of $\bar{X}$ for all $n$ ($n = 1, 2, \cdots$). It is convenient to use $\bar{x}$ to denote an arbitrary point in $S_{\bar{x}}$. We define

(8)  $g_{\bar{x}}(\theta) = \psi(\theta) e^{\theta \bar{x}}$ for all $\bar{x} \in S_{\bar{x}}$.


It is easily seen that a maximum likelihood estimate $\hat{\theta}$ of $\theta$ based on the observed value $\bar{x}$ of $\bar{X}$ is obtained by maximizing (8). The following assumptions are used in Lemma 1 and Theorem 2.

3. (i) For each $\bar{x} \in S_{\bar{x}}$ there exists a point $\hat{\theta} = \hat{\theta}(\bar{x}) \in \Omega$ such that $g_{\bar{x}}(\theta)$ is strictly increasing in $[\underline{\theta}, \hat{\theta}]$ and strictly decreasing in $[\hat{\theta}, \bar{\theta}]$.

(ii) There exist points $\bar{x}_1 < \bar{x}_2$ in $S_{\bar{x}}$ such that

$\hat{\theta}(\bar{x}_1) \ne \hat{\theta}(\bar{x}_2)$ and $\theta_1 < \hat{\theta}(\bar{x}_i) < \theta_2$ ($i = 1, 2$).

If $\psi(\theta)$ is differentiable, then for Assumptions 3(i) and (ii) to hold it is sufficient that for each $\bar{x} \in S_{\bar{x}}$ the maximum likelihood equation has a unique solution $\hat{\theta}(\bar{x})$ which takes on all values in $\Omega$ as $\bar{x}$ runs through $S_{\bar{x}}$. In the normal, binomial and Poisson cases the reader can easily check that $\hat{\theta}(\bar{x}) = \bar{x}$ and that the assumptions are satisfied.
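The check suggested for the normal case can be carried out numerically. The sketch below assumes the unit-variance normal law written in the form (2) with $\psi(\theta) = e^{-\theta^2/2}$, and maximizes $g_{\bar{x}}(\theta)$ of (8) over a grid; the grid limits and the helper names are illustrative choices, not part of the paper.

```python
import math


def g(theta, xbar):
    # g_xbar(theta) = psi(theta) * exp(theta * xbar) from (8), with the
    # assumed normal-case normalizer psi(theta) = exp(-theta^2 / 2).
    return math.exp(-0.5 * theta * theta + theta * xbar)


def argmax_on_grid(xbar, lo=-5.0, hi=5.0, steps=10001):
    # Crude grid search; adequate for checking a unimodal function.
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda t: g(t, xbar))


theta_hat = argmax_on_grid(1.3)
```

The maximizer agrees with $\bar{x}$ up to the grid spacing, and $g_{\bar{x}}$ rises to the left of it and falls to the right, as Assumption 3(i) requires.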


3. Regular convergence in the space of decision functions. It is assumed that the reader is familiar with the concepts of cost, loss, risk function and Bayes solution. However, the definition of regular convergence in the space of decision functions given by Wald ([1], p. 65) is too general for our purposes and is reviewed here.

Let $x^* = (x_1, x_2, \cdots)$ denote a sequence of independent observations on a random variable $X$. A (randomized) decision function $\delta$ following Wald ([1], p. 6) consists of a set of nonnegative functions $\delta_{jn}(x^*) \ge 0$ ($j = 0, 1, 2$; $n = 0, 1, 2, \cdots$) defined for all $x^*$ and such that for all $x^*$

$\sum_{j=0}^{2} \delta_{jn}(x^*) = 1$  ($n = 0, 1, 2, \cdots$).

The quantities $\delta_{jn}(x^*)$ ($j = 0, 1, 2$) [which depend only on the first $n$ coordinates of $x^*$] represent, respectively, the probability of taking another observation, accepting $H_1$ and accepting $H_2$ when $n$ observations have been taken, the observed values are the first $n$ coordinates of $x^*$ and $\delta$ is the decision rule used. A nonrandomized decision function is a special case of the above in which the value of $\delta_{jn}(x^*)$ is zero or one for each $n$, each $j$ and each $x^*$.

Two cases are considered according as $X$ is a discrete or absolutely continuous chance variable. In the discrete case a sequence $\{\delta^i\}$ is said to converge to a limit $\delta^0$ in the regular sense if

(9)  $\lim_{i \to \infty} \delta^i_{jn}(x^*) = \delta^0_{jn}(x^*)$

for each integer $n \ge 0$, each $x^*$ and each $j$ ($j = 0, 1, 2$). In the absolutely continuous case let

(10)  $\bar{\delta}_{jn}(x^*) = \left[ \prod_{k=0}^{n-1} \delta_{0k}(x^*) \right] \delta_{jn}(x^*)$

denote the probability, given $x^*$ and $\delta$, of selecting $d_j$ immediately after the $n$th observation. For each $j$ and $n$ it is assumed that this is a Borel measurable function of $x_1, \cdots, x_n$. For each $j$ and $n$ let

(11)  $\bar{\delta}_{jn}(S) = \int_S \bar{\delta}_{jn}(x^*)\, dx_1\, dx_2 \cdots dx_n$

where $S = S_n$ is a Lebesgue measurable set in the space of $x_1, x_2, \cdots, x_n$. For $n = 0$ the left member of (11) as well as the integrand are defined to be $\bar{\delta}_{j0}(x^*) = \delta_{j0}$ ($j = 0, 1, 2$). The sequence $\{\delta^i\}$ ($i = 1, 2, \cdots$) is said to converge to $\delta^0$ in the regular sense if

(12)  $\lim_{i \to \infty} \bar{\delta}^i_{jn}(S) = \bar{\delta}^0_{jn}(S)$

for all $j$, all bounded sets $S$ and all integers $n \ge 0$. It has been shown ([1], Theorem 3.1) that the sequence $\{\delta^i\}$ ($i = 1, 2, \cdots$) is convergent, that is, a limiting rule $\delta^0$ satisfying (12) exists, if the limit in the left member of (12) exists for all $j$, all bounded, measurable sets $S$ and all integers $n \ge 0$.

The definition of regular convergence for the absolutely continuous case is weaker than that of the pointwise type which was given for the discrete case. It will be useful to note that the corresponding definition of pointwise convergence for the absolutely continuous case implies regular convergence. This implication is easily seen from the fact that

$\lim_{i \to \infty} \int_S \bar{\delta}^i_{jn}(x^*)\, dx_1 \cdots dx_n = \int_S \lim_{i \to \infty} \bar{\delta}^i_{jn}(x^*)\, dx_1 \cdots dx_n$

for all $j$, all bounded, measurable sets $S$ and all integers $n \ge 0$. This equality follows from the Lebesgue bounded convergence theorem since $0 \le \bar{\delta}^i_{jn}(x^*) \le 1$ and $S$ is bounded.


4. Relation of this paper to a result of Wald. The purpose of this paper is to give an essentially complete class of decision rules for the problem described above. The following theorem of Wald ([1], Theorem 3.19) is used in the proof. Let $D_b$ denote the class of all decision rules whose risk functions are (uniformly) bounded in $\Omega$. Let $\zeta$ denote a class of a priori cdf's such that for any $\xi_0 \notin \zeta$ there is a sequence $\{\xi_i\}$ of members of $\zeta$ such that

(13)  $\lim_{i \to \infty} \xi_i(\omega) = \xi_0(\omega)$

for any measurable subset $\omega$ of $\Omega$. Let $B_\zeta$ denote the class of all Bayes solutions relative to members of $\zeta$. Then the closure $\bar{B}_\zeta$ in the regular sense of $B_\zeta$ is essentially complete relative to $D_b$.

To obtain this result, several assumptions are made on the cost function, the loss function, the space $D$ of decisions and the set of decision rules available to the experimenter. The verification of these assumptions under the more restrictive assumptions of this paper is mostly trivial and is omitted. See ([1], Chap. 3).

An a priori cdf $\xi$ will be called nondegenerate if it assigns positive probability to every open subset of $\Omega$. It will now be shown that the class $\zeta$ of nondegenerate cdf's satisfies the hypothesis of Wald's theorem. Let $\xi_0$ be any cdf not in $\zeta$. Let $\xi^*$ be any specified cdf in $\zeta$. Let $\{\epsilon_i\}$ ($i = 1, 2, \cdots$) be a decreasing sequence of numbers such that $0 < \epsilon_i < 1$ and $\lim_{i \to \infty} \epsilon_i = 0$. Define

(14)  $\xi_i(\theta) = \epsilon_i \xi^*(\theta) + (1 - \epsilon_i) \xi_0(\theta)$.

Clearly $\xi_i(\theta)$ is a cdf in $\zeta$ for each $i$ and we have for any measurable subset $\omega$ of $\Omega$

(15)  $\lim_{i \to \infty} \xi_i(\omega) = \xi_0(\omega)$.

Hence the class $\zeta$ satisfies the hypothesis of Wald's Theorem 3.19.
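The mixture construction (14) is easy to illustrate on a finite parameter grid, where "positive probability on every open subset" is replaced, as a simplifying assumption, by "positive weight at every grid point". The sketch below uses hypothetical names and a four-point grid chosen only for illustration.

```python
def mixture_prior(xi0, xi_star, eps):
    """Weights of eps * xi_star + (1 - eps) * xi0 on a common finite grid, as in (14)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return [eps * s + (1.0 - eps) * z for s, z in zip(xi_star, xi0)]


# Assumed example: xi0 is degenerate (all mass on one grid point),
# xi_star is uniform and hence puts positive weight everywhere.
xi0 = [0.0, 1.0, 0.0, 0.0]
xi_star = [0.25, 0.25, 0.25, 0.25]

# eps_i = 1 / (i + 1) is a decreasing sequence in (0, 1) tending to zero.
sequence = [mixture_prior(xi0, xi_star, 1.0 / (i + 1)) for i in range(1, 6)]
distances = [max(abs(a - b) for a, b in zip(xi, xi0)) for xi in sequence]
```

Every member of `sequence` has strictly positive weights, while `distances` decreases toward zero, which is the discrete analogue of (15).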


5. The essentially complete class A and outline of the proof. If a decision rule $\delta$ depends on the observations only through $v = \sum_{i=1}^{n} x_i$ for each $n$ ($n = 1, 2, \cdots$) then $\delta_{jn}(v)$ will be used to denote the probability of accepting $H_j$ given the pair $(n, v)$. The class $A$ of decision rules is defined as follows. The decision rule $\delta \in A$ if and only if for each positive integer $n$ the following hold.

(i) $\delta$ depends on the observations only through $v$.

(ii) $\delta$ specifies a closed interval $J_n(\delta): [a_{1n}(\delta), a_{2n}(\delta)]$ or simply $J_n: [a_{1n}, a_{2n}]$ which determines the action to be taken as follows.

(a) Another observation should be taken (i.e., $\delta_{0n}(v) = 1$) if $a_{1n} < v < a_{2n}$.

(b) Experimentation is stopped as soon as $v \notin J_n$ and

(1) $H_1$ is accepted (i.e., $\delta_{1n}(v) = 1$) if $v < a_{1n}$,
(2) $H_2$ is accepted (i.e., $\delta_{2n}(v) = 1$) if $v > a_{2n}$.

(c) If $a_{1n} < a_{2n}$ then $\delta_{2n}(a_{1n}) = \delta_{1n}(a_{2n}) = 0$.
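The definition of the class $A$ can be read as a simple procedure. The following sketch lists, for a given stage, which actions conditions (a) through (c) permit, and then runs a rule along a sequence of observations. The function names, the action labels and the tie-breaking choice (continue whenever continuing is permitted) are assumptions made for illustration; endpoint comparisons use exact floating-point equality and would need care in practice.

```python
def allowed_actions(v, a1, a2):
    """Actions permitted at one stage for cumulative sum v and interval [a1, a2]."""
    if a1 > a2:
        raise ValueError("the interval [a1, a2] must be nonempty")
    if v < a1:
        return {"accept_H1"}          # condition (b)(1)
    if v > a2:
        return {"accept_H2"}          # condition (b)(2)
    if a1 < v < a2:
        return {"continue"}           # condition (a)
    if a1 == a2:
        # Degenerate interval: condition (c) imposes no restriction here.
        return {"continue", "accept_H1", "accept_H2"}
    # v is an endpoint of a nondegenerate interval: condition (c).
    if v == a1:
        return {"continue", "accept_H1"}
    return {"continue", "accept_H2"}


def run_rule(xs, intervals):
    """Apply the rule to observations xs; ties are resolved by continuing."""
    v = 0.0
    for n, (x, (a1, a2)) in enumerate(zip(xs, intervals), start=1):
        v += x
        actions = allowed_actions(v, a1, a2)
        if "continue" not in actions:
            return next(iter(actions)), n
    return "continue", min(len(xs), len(intervals))
```

For instance, with intervals `[(-2, 2), (-2, 2), (-2.5, 2.5)]` and observations `[1.0, 1.0, 1.0]` the sum reaches the endpoint 2 at the second stage, where continuing is still permitted, and leaves the interval at the third.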

Condition (c) can be omitted when $\mu$ is absolutely continuous since it concerns only a set of $G_\theta$-measure zero for each $\theta \in \Omega$. The closed interval $[a_{1n}, a_{2n}]$ may reduce to a point but will always be nonempty; however, in the discrete case it need not contain any points of positive probability. Then the principal result of this paper will be that the class $A$ is essentially complete relative to $D_b$. Clearly an immediate consequence will be:

COROLLARY 1. The class $A_b$ of all decision rules in $A$ whose risk functions are bounded in $\Omega$ forms an essentially complete class relative to $D_b$.

The proof of the principal result consists in showing

(a) that $A$ is essentially complete relative to its closure $\bar{A}$ in the regular sense, and

(b) that every Bayes solution (if altered at most on a set of measure zero) belongs to $A$ and hence that $\bar{B}_\zeta \subset \bar{A}$. It then follows from Wald's Theorem 3.19 and (b) that $\bar{A}$ is essentially complete relative to $D_b$. Using (a) it follows that $A$ has the same property. The paper ends with a corollary which shows that the same result holds if no indifference zone is given.


6. A is sequentially compact. In order to show that $A$ is essentially complete relative to $\bar{A}$ it is clearly sufficient to prove the following theorem.

THEOREM 1. For any $\delta \in \bar{A}$ there is a $\delta^* \in A$ which is equivalent to $\delta$, that is, for which

(16)  $r(\theta, \delta^*) = r(\theta, \delta)$ for all $\theta \in \Omega$.

PROOF. To prove the theorem it is sufficient to show that for any sequence $\{\delta^i\}$ ($i = 1, 2, \cdots$) of members of $A$ which is convergent in the regular sense
(i) there exists a limit (in the regular sense) $\delta^*$ which is in $A$ and
(ii) any two limits (of the same sequence) are equivalent. In the discrete case a much stronger result holds, namely $\bar{A} = A$. Since in the discrete case the set of quantities $\delta_{jn}(x^*)$ for all possible triples $(n, j, x^*)$ determines the decision function completely then by (9) the convergent sequence $\{\delta^i\}$ must have a unique limit $\delta^*$. It remains only to show that $\delta^* \in A$.

In the absolutely continuous case it is easily seen by (11) that any two limits of the same sequence can differ at most on a set of $n$-dimensional Lebesgue measure zero for $n > 0$ and not at all for $n = 0$. Hence they must be equivalent. It therefore remains in both cases only to identify the limit $\delta^*$ and show that $\delta^* \in A$.

1. Consider the sequences $\{a_{\beta n}(\delta^i)\}$ ($i = 1, 2, \cdots$) for $\beta = 1, 2$, and $n = 0, 1, 2, \cdots$. Since each element of each of these sequences lies in the same compact set [the real line with $\pm\infty$ added] then by the method of diagonalization there exists a subsequence $\{i_\alpha\}$ of the positive integers such that each of the above sequences converges. Let $a_{\beta n}$ denote the corresponding limits for each $\beta$ and each $n$. Then clearly

(17)  $-\infty \le a_{1n} \le a_{2n} \le \infty$  ($n = 1, 2, \cdots$).

2. Consider the set of sequences $\{\delta^{i_\alpha}_{jn}(a_{\beta n})\}$ ($\alpha = 1, 2, \cdots$) for $\beta = 1, 2$; $j = 0, 1, 2$ and $n = 0, 1, 2, \cdots$. Since each element of each of these sequences lies in the closed interval $[0, 1]$ which is compact, then by diagonalization there exists a subsequence $\{i_\gamma\}$ of $\{i_\alpha\}$ such that each of the above sequences converges. Let $\delta_{jn}(a_{\beta n})$ denote the corresponding limits for each $\beta$, each $j$ and each $n$. Then clearly $0 \le \delta_{jn}(a_{\beta n}) \le 1$ for each triplet $(\beta, j, n)$ and

(18)  $\sum_{j=0}^{2} \delta_{jn}(a_{\beta n}) = 1$  ($\beta = 1, 2$; $n = 0, 1, 2, \cdots$).

The symbol $\delta_{j0}(a_{\beta 0})$ denotes the constants $\delta_{j0}$ ($j = 0, 1, 2$). Since $\{i_\gamma\}$ is a subsequence of $\{i_\alpha\}$ the limits $a_{\beta n}$ remain unchanged.

3. Let $J_n$ denote the nonempty closed interval $[a_{1n}, a_{2n}]$ and let $\delta^*$ denote the following decision rule.
(i) $\delta^*_{j0} = \delta_{j0}$ ($j = 0, 1, 2$).
(ii) If $a_{1n} < v < a_{2n}$ then $\delta^*_{0n}(v) = 1$. (Continue experimentation.)
(iii) If $v \notin J_n$ then $\delta^*_{0n}(v) = 0$ (stop experimentation) and
(a) $\delta^*_{1n}(v) = 1$ for $v < a_{1n}$ (accept $H_1$),
(b) $\delta^*_{2n}(v) = 1$ for $v > a_{2n}$ (accept $H_2$).
(iv) If $v = a_{\beta n}$ then $\delta^*_{jn}(a_{\beta n}) = \delta_{jn}(a_{\beta n})$ ($\beta = 1, 2$; $j = 0, 1, 2$).
(Condition (iv) as well as paragraph 2 could be omitted in the absolutely continuous case since they concern only a set of measure zero.)
4. Clearly $\delta^* \in A$ and it remains only to show that $\delta^*$ is a limit of $\{\delta^i\}$ in the regular sense. The above construction was such that

(19)  $\delta^*_{jn}(v) = \lim_{\gamma \to \infty} \delta^{i_\gamma}_{jn}(v)$ for each triple $(n, j, v)$,

that is, such that the subsequence $\{\delta^{i_\gamma}\}$ converges pointwise to $\delta^*$. Hence by Section 3 $\{\delta^{i_\gamma}\}$ converges to $\delta^*$ in the regular sense, that is, for the discrete and absolutely continuous cases respectively

(20)  $\lim_{\gamma \to \infty} \delta^{i_\gamma}_{jn}(v) = \delta^*_{jn}(v)$ for each triple $(j, n, v)$

(21)  $\lim_{\gamma \to \infty} \bar{\delta}^{i_\gamma}_{jn}(S_n) = \bar{\delta}^*_{jn}(S_n)$ for each triple $(j, n, S_n)$.

By hypothesis the full sequence $\{\delta^i\}$ is convergent in the regular sense so that the left members above converge for the full sequence. They must therefore converge to the same limits as the subsequences, namely, the right-hand members above. Hence $\delta^*$ is a limit in the regular sense of $\{\delta^i\}$. This proves Theorem 1.


7. Terminating property of the Bayes solutions in $B_\zeta$. The proof given here that (almost) every Bayes solution with respect to a cdf $\xi \in \zeta$ belongs to $A$ rests on the fact that every Bayes solution in $B_\zeta$ terminates in a finite number of steps with probability one. Assumptions 3 are used only for the latter result which we shall now prove after some definitions and lemmas.

Let $\xi$ denote an arbitrary cdf over $\Omega$. The a posteriori probability that $\theta$ belongs to any Borel measurable subset $\omega$ of $\Omega$ given the triple $(n, v, \xi)$ is

(22)  $\xi_v^n(\omega) = \frac{\int_\omega [\psi(\theta)]^n e^{\theta v}\, d\xi(\theta)}{\int_\Omega [\psi(\theta)]^n e^{\theta v}\, d\xi(\theta)}$.

The notation $\xi_v^n$ for the a posteriori cdf given the triple $(n, v, \xi)$ is justified since the right member of (22) depends on the observations only through $n$ and $v$. $\xi_v^0$ will then denote the original a priori cdf $\xi$ for all $v$.
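Formula (22) can be evaluated directly when the a priori distribution is concentrated on finitely many parameter values. The sketch below makes that simplifying assumption (the paper works with general cdf's) and computes the a posteriori weights from $n$ and $v$ alone; the function name and the use of log weights for numerical stability are illustrative choices.

```python
import math


def posterior(prior, thetas, log_psi, n, v):
    """A posteriori weights from (22) for a prior on a finite grid.

    Each weight is proportional to prior * psi(theta)^n * exp(theta * v),
    so only the pair (n, v) enters, not the individual observations.
    """
    logw = [
        math.log(p) + n * log_psi(t) + t * v if p > 0.0 else -math.inf
        for p, t in zip(prior, thetas)
    ]
    top = max(logw)
    w = [math.exp(lw - top) for lw in logw]
    total = sum(w)
    return [x / total for x in w]


def normal_log_psi(theta):
    # Assumed normal case with unit variance: psi(theta) = exp(-theta^2 / 2).
    return -0.5 * theta * theta


thetas = [-1.0, 0.0, 1.0]
prior = [0.2, 0.5, 0.3]
batch = posterior(prior, thetas, normal_log_psi, 2, 0.7 + (-0.1))
stepwise = posterior(
    posterior(prior, thetas, normal_log_psi, 1, 0.7), thetas, normal_log_psi, 1, -0.1
)
```

Updating one observation at a time gives the same weights as a single update with $(n, v) = (2, x_1 + x_2)$, and $n = 0$ returns the a priori weights.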

Following Wald ([1], Section 4.1.1) certain functions $\rho_0$, $\rho_1$, $\rho_a$, $\rho_b$ will be defined which can be used to describe the class $B_\xi$ of Bayes solutions relative to the a priori cdf $\xi$. Let $D^*$, $D_0$ and $D_1$ denote respectively the class of all decision rules, the class of all decision rules which require that no observations be taken with probability one and the class of all decision rules which require that at least one observation be taken with probability one. Wald defines for any triple $(n, v, \xi)$,

(23)  $\rho_0(\xi_v^n) = \inf_{\delta \in D_0} r(\xi_v^n, \delta)$

(24)  $\rho(\xi_v^n) = \inf_{\delta \in D^*} r(\xi_v^n, \delta)$

(25)  $\rho_1(\xi_v^n) = \inf_{\delta \in D_1} r(\xi_v^n, \delta)$.

(Actually the function $\rho(\xi_v^n)$ depends on the sequence $(c_{n+1}, c_{n+2}, \cdots)$ which may vary with $n$ and should be written $\rho^n(\xi_v^n)$ unless all the $c_i$ ($i = 1, 2, \cdots$) are equal. To simplify notation and since the proofs are not affected it will be assumed that the superscript on $\xi$ also applies to $\rho$. The same remark applies to (25).) Since $D_0 \subset D^*$ then clearly for any triple $(n, v, \xi)$

(26)  $f(\xi_v^n) = \rho_0(\xi_v^n) - \rho(\xi_v^n) \ge 0$.

Wald has shown ([1], Theorems 4.2 and 4.1) that for any cdf $\xi$
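(27)  $\rho(\xi_v^n) = \min[\rho_0(\xi_v^n), \rho_1(\xi_v^n)]$

and that

(28)  $\rho_1(\xi_v^n) = \int_\Omega \int_{-\infty}^{\infty} \rho(\xi_{v+y}^{n+1})\, dG_\theta(y)\, d\xi_v^n(\theta) + c_{n+1}$.

If $g_\theta(x)$ denotes the integrand in (2) then (28) can be written as

(29)  $\rho_1(\xi_v^n) = \int_\Omega \int_{-\infty}^{\infty} \rho(\xi_{v+y}^{n+1}) g_\theta(y)\, d\mu(y)\, d\xi_v^n(\theta) + c_{n+1}$.

Denote by $\omega_1$, $\omega_2$ respectively the closed intervals $[\theta_2, \bar{\theta}]$, $[\underline{\theta}, \theta_1]$ in $\Omega$. Let

(30)  $\rho_a(\xi_v^n) = \int_\Omega W(\theta, 1)\, d\xi_v^n(\theta) = \int_{\omega_1} W(\theta, 1)\, d\xi_v^n(\theta)$

(31)  $\rho_b(\xi_v^n) = \int_\Omega W(\theta, 2)\, d\xi_v^n(\theta) = \int_{\omega_2} W(\theta, 2)\, d\xi_v^n(\theta)$

where the last expression in each case follows from (5). Clearly, $\rho_a(\xi_v^n)$ and $\rho_b(\xi_v^n)$ denote respectively the risks of accepting $H_1$ and $H_2$ given the triple $(n, v, \xi)$. Then by (23)

(32)  $\rho_0(\xi_v^n) = \min[\rho_a(\xi_v^n), \rho_b(\xi_v^n)]$.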


The class $B_\xi$ of Bayes solutions relative to the fixed a priori cdf $\xi$ can be described as follows.

(i) If $\rho_1(\xi_v^n) < \rho_0(\xi_v^n)$ for some pair $(n, v)$ then for any $\delta \notin D_1$ there exists a decision rule $\delta_1 \in D_1$ such that $r(\xi_v^n, \delta_1) < r(\xi_v^n, \delta)$. Hence for such a pair $(n, v)$ any Bayes solution $\delta_\xi \in D_1$, that is, any Bayes solution will prescribe another observation with probability one.

(ii) If $\rho_a(\xi_v^n) < \min[\rho_b(\xi_v^n), \rho_1(\xi_v^n)]$ for some pair $(n, v)$ then similarly any Bayes solution $\delta_\xi$ will accept $H_1$ with probability one when the pair $(n, v)$ is observed.

(iii) A similar result holds for $H_2$ when $\rho_b(\xi_v^n) < \min[\rho_a(\xi_v^n), \rho_1(\xi_v^n)]$.

(iv) If $\rho_a(\xi_v^n) = \rho_b(\xi_v^n) < \rho_1(\xi_v^n)$ for some pair $(n, v)$ then any Bayes solution $\delta_\xi$ will accept $H_1$, $H_2$ with probabilities $p$, $1 - p$ respectively where $0 \le p \le 1$ when the pair $(n, v)$ is observed. (It will be a consequence of Lemmas 3 and 5 that when $\mu$ is absolutely continuous the equalities in (iv) through (vii) take place at most on a set of $v$ points of $G_\theta$-measure zero for each $\theta \in \Omega$. The Bayes solution can be arbitrary on such a set since it is defined only up to a set of measure zero.)

(v) If $\rho_1(\xi_v^n) = \rho_a(\xi_v^n) < \rho_b(\xi_v^n)$ for some pair $(n, v)$ then any Bayes solution $\delta_\xi$ will accept $H_1$ and take another observation with probabilities $p$ and $1 - p$ respectively where $0 \le p \le 1$ when the pair $(n, v)$ is observed. (In (v) through (vii) it is assumed that the value $\rho_1(\xi_v^n)$ is attainable with some $\delta \in D_1$ for each pair $(n, v)$. This is actually a consequence of ([1], Theorem 3.2).)

(vi) A similar result holds when $\rho_1(\xi_v^n) = \rho_b(\xi_v^n) < \rho_a(\xi_v^n)$.

(vii) If $\rho_1(\xi_v^n) = \rho_a(\xi_v^n) = \rho_b(\xi_v^n)$ for some pair $(n, v)$ then any randomized or nonrandomized decision rule is a Bayes solution when the pair $(n, v)$ is observed.
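Cases (i) through (vii) amount to choosing, at each pair $(n, v)$, among the actions whose risk equals the smallest of $\rho_a$, $\rho_b$, $\rho_1$; a Bayes solution may randomize in any way over that set. A minimal sketch follows, in which the function name, the action labels and the optional tolerance `tol` (for floating-point ties) are assumptions introduced for illustration.

```python
def bayes_actions(rho_a, rho_b, rho_1, tol=0.0):
    """Actions that minimize the a posteriori risk at one pair (n, v).

    rho_a, rho_b: risks of accepting H1, H2 now; rho_1: risk of taking
    at least one more observation. Ties within tol are all reported.
    """
    best = min(rho_a, rho_b, rho_1)
    labelled = (("accept_H1", rho_a), ("accept_H2", rho_b), ("continue", rho_1))
    return {name for name, risk in labelled if risk - best <= tol}
```

A singleton result corresponds to cases (i) through (iii), a pair to cases (iv) through (vi), and the full set to case (vii).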

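Let $\hat{\theta}$ denote the maximum likelihood estimate of $\theta$ given the pair $(n, v)$. Since $\hat{\theta}$ depends on $v/n$ it is convenient in Theorem 2 to consider the statistic $\bar{x} = v/n$ instead of $v$ itself. Then $\xi_{\bar{x}}^n$ will denote the a posteriori cdf given the triple $(n, \bar{x}, \xi)$.

LEMMA 1. The function $\hat{\theta}(\bar{x})$ which is uniquely defined by Assumption 3(i) is nondecreasing.

PROOF. Consider two points $\bar{x}_1 < \bar{x}_2$ and let $\hat{\theta}_1 = \hat{\theta}(\bar{x}_1)$ and $\hat{\theta}_2 = \hat{\theta}(\bar{x}_2)$. By Assumption 3(i) we have for all $\theta \in \Omega$

(33)  $\psi(\hat{\theta}_1) e^{\hat{\theta}_1 \bar{x}_1} \ge \psi(\theta) e^{\theta \bar{x}_1}$
(34)  $\psi(\hat{\theta}_2) e^{\hat{\theta}_2 \bar{x}_2} \ge \psi(\theta) e^{\theta \bar{x}_2}$.

Hence putting $\theta = \hat{\theta}_2$ in (33) and $\theta = \hat{\theta}_1$ in (34), dividing and taking logarithms yields

$\hat{\theta}_2(\bar{x}_2 - \bar{x}_1) \ge \hat{\theta}_1(\bar{x}_2 - \bar{x}_1)$.

Since $\bar{x}_2 > \bar{x}_1$, it follows that

(35)  $\hat{\theta}_2 = \hat{\theta}(\bar{x}_2) \ge \hat{\theta}(\bar{x}_1) = \hat{\theta}_1$.  Q.E.D.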


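THEOREM 2. If $\xi \in \zeta$ then there exists an integer $N = N(\xi)$ such that any Bayes solution $\delta_\xi$ relative to $\xi$ will terminate before $N + 1$ observations with probability one.

PROOF. Let $\bar{x}_1 < \bar{x}_2$ denote the two points mentioned in Assumption 3. By Lemma 1 and this assumption $\theta_1 < \hat{\theta}(\bar{x}_1) < \hat{\theta}(\bar{x}_2) < \theta_2$. Define

$\theta_0 = \tfrac{1}{2}[\hat{\theta}(\bar{x}_1) + \hat{\theta}(\bar{x}_2)]$.

Let $\delta^t$ denote the following terminal decision rule: "Accept $H_1$ if $\hat{\theta}(\bar{x}) \le \theta_0$, accept $H_2$ if $\hat{\theta}(\bar{x}) > \theta_0$." Let $S_{\bar{x}}^1$ and $S_{\bar{x}}^2$ denote respectively the set of points in $S_{\bar{x}}$ for which $\hat{\theta}(\bar{x}) \le \theta_0$ and $\hat{\theta}(\bar{x}) > \theta_0$. These sets are clearly not empty. If $\delta^t$ is used the average risk given the triple $(n, \bar{x}, \xi)$ is

(36)  $r(\xi_{\bar{x}}^n, \delta^t) = \begin{cases} \int_{\omega_1} W(\theta, \delta^t)\, d\xi_{\bar{x}}^n(\theta) & \text{for } \bar{x} \in S_{\bar{x}}^1, \\ \int_{\omega_2} W(\theta, \delta^t)\, d\xi_{\bar{x}}^n(\theta) & \text{for } \bar{x} \in S_{\bar{x}}^2. \end{cases}$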


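In order to show that

(37)  $\lim_{n \to \infty} r(\xi_{\bar{x}}^n, \delta^t) = 0$ uniformly for all $\bar{x} \in S_{\bar{x}}$

it is sufficient to show that the upper value in (36) tends to zero uniformly for all $\bar{x} \in S_{\bar{x}}^1$ and a similar result for the lower value. Since the two cases are alike, only the first will be shown. Since by Assumption 1 $W(\theta, \delta^t)$ is a bounded function of $\theta$ and since $g_{\bar{x}}(\theta) > 0$ then using (22) with $v$ replaced by $n\bar{x}$ it is sufficient for (37) to show that

(38)  $\lim_{n \to \infty} \left[ \frac{\int_{\omega_1} (g_{\bar{x}}(\theta))^n\, d\xi(\theta)}{\int_{\omega} (g_{\bar{x}}(\theta))^n\, d\xi(\theta)} \right] = 0$ uniformly for $\bar{x} \in S_{\bar{x}}^1$

where $\omega$ is any subset of $\Omega$. Take for $\omega$ the interval $\omega_0 = (\theta_0, \hat{\theta}_2)$ where $\hat{\theta}_2 = \hat{\theta}(\bar{x}_2)$. For any $n$ by Assumption 3(i) the function $[g_{\bar{x}}(\theta)]^n$ is strictly decreasing in the interval $(\hat{\theta}(\bar{x}), \bar{\theta})$ and hence also in the subintervals $(\theta_0, \hat{\theta}_2)$ and $(\theta_2, \bar{\theta})$. Hence an upper bound for the expression in brackets in (38) is

(39)  $\frac{(g_{\bar{x}}(\theta_2))^n}{(g_{\bar{x}}(\hat{\theta}_2))^n\, \xi(\omega_0)}$.

Since $\xi \in \zeta$ and $\theta_0 < \hat{\theta}_2$ then $\xi(\omega_0)$ is a positive constant. By (8)

(40)  $\frac{g_{\bar{x}}(\theta_2)}{g_{\bar{x}}(\hat{\theta}_2)} = \frac{\psi(\theta_2)}{\psi(\hat{\theta}_2)}\, e^{(\theta_2 - \hat{\theta}_2)\bar{x}}$.

Since $\theta_0 < \hat{\theta}_2 = \hat{\theta}(\bar{x}_2)$ it follows from Lemma 1 that $\bar{x}_2$ is an upper bound for the set $S_{\bar{x}}^1$. Hence by (40) since $\theta_2 > \hat{\theta}_2$ it is sufficient to show that

(41)  $\lim_{n \to \infty} \left[ \frac{g_{\bar{x}_2}(\theta_2)}{g_{\bar{x}_2}(\hat{\theta}_2)} \right]^n = 0$.

The function $g_{\bar{x}_2}(\theta)$ is strictly decreasing in the interval $(\hat{\theta}_2, \theta_2)$. The expression in brackets in (41) is therefore a positive constant less than unity and (41) follows. This proves (37).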


It follows from (41) that the approach to zero is at least exponentially fast. On the other hand, by Assumption 2 the constants $c_n$ may approach zero but not fast enough for $\sum c_i$ to converge. It follows from (7) that there exists an integer $N = N(\xi)$ such that for $n \ge N$

(42)  $0 \le \rho_0(\xi_{\bar{x}}^n) \le r(\xi_{\bar{x}}^n, \delta^t) < c_{n+1}$ for all $\bar{x} \in S_{\bar{x}}$.

By (28) it follows that $\rho_1(\xi_{\bar{x}}^n) \ge c_{n+1}$ for all pairs $(n, \bar{x})$ and hence for $n \ge N$

(43)  $\rho_0(\xi_{\bar{x}}^n) < \rho_1(\xi_{\bar{x}}^n)$ for all $\bar{x} \in S_{\bar{x}}$.

It follows from the description of $B_\xi$ above that any Bayes solution $\delta_\xi$ will terminate experimentation before $N + 1$ observations with probability one. This proves Theorem 2.
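The last step of the proof compares an exponentially decreasing risk bound with costs that decrease no faster than a multiple of $1/n$. The sketch below illustrates this with a hypothetical bound of the form $M q^n$ ($0 < q < 1$) and a user-supplied cost sequence, and searches a finite horizon for the first index beyond which the bound stays below $c_{n+1}$. The constants $M$, $q$, the horizon and the function name are assumptions for illustration; a finite scan is a numerical check, not a proof.

```python
def termination_bound(m, q, cost, horizon=10_000):
    """Smallest N with m * q**n < cost(n + 1) for every n in [N, horizon].

    Returns None if the inequality still fails at the horizon itself.
    """
    last_failure = 0
    for n in range(1, horizon + 1):
        if not m * q ** n < cost(n + 1):
            last_failure = n
    return None if last_failure == horizon else last_failure + 1


# Assumed example: bound 0.5**n against costs c_n = 1/n, so that n * c_n = 1.
n_bound = termination_bound(1.0, 0.5, lambda n: 1.0 / n)
```

In this example the inequality fails only at $n = 1$ (where both sides equal $1/2$), so the scan reports $N = 2$.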

The above result does not hold in the case of testing a simple hypothesis against a simple alternative. For example, no Wald sequential probability ratio test, whose boundaries consist of a pair of parallel lines, has the above property. It would be interesting to determine whether we can restrict our attention to tests consisting of pairs of converging straight lines. Although Theorem 2 is useful here as a tool it appears to have some interest in its own right.

Let $S_v^n$ denote the interval generated by $v = n\bar{x}$ as $\bar{x}$ varies over $S_{\bar{x}}$. Since (43) holds for all $\bar{x} \in S_{\bar{x}}$ it will also hold for all $v \in S_v^n$, that is,

(44)  $\rho_0(\xi_v^n) < \rho_1(\xi_v^n)$ for all $v \in S_v^n$.

Hence by (27) for $n \ge N$

(45)  $\rho(\xi_v^n) = \rho_0(\xi_v^n)$ for all $v \in S_v^n$.

In the normal case the sets $S_v^n$ are equal to the real line for each $n$. If the sets $S_v^n$ actually depend on $n$, as in the binomial case, the range of validity of (44) and (45) varies with $n$ only because $\xi_v^n$ may not be defined for all $v$. Hence (44) and (45) hold whenever the expressions therein are defined.


8. Other properties of the Bayes solutions in $B_\zeta$. The purpose of this section is to show that (almost) every Bayes solution $\delta_\xi$ relative to a $\xi \in \zeta$ belongs to $A$. In the discrete case this holds for all Bayes solutions $\delta_\xi$ with $\xi \in \zeta$. In the absolutely continuous case the result is slightly weaker, namely, for each $\xi \in \zeta$ any Bayes solution $\delta_\xi$ not in $A$ differs from another Bayes solution $\delta'_\xi$ in $A$ at most on a set of $n$-dimensional Lebesgue measure zero for each positive integer $n$.

Let

(46)  $f_1(\xi_v^n) = \rho_0(\xi_v^n) - \rho_1(\xi_v^n)$
(47)  $f_2(\xi_v^n) = \rho_b(\xi_v^n) - \rho_a(\xi_v^n)$.

Thus $f_1(\xi_v^n)$ can be regarded as the advantage of taking another observation over stopping for the given triple $(n, v, \xi)$. Similarly $f_2(\xi_v^n)$ is the advantage of accepting $H_1$ over accepting $H_2$ for the given triple $(n, v, \xi)$.

It follows from the description of $B_\xi$ above that in order to prove the desired result it is sufficient to show the following for each $\xi \in \zeta$.

For each positive integer $n$ there exists a nonempty closed interval $J_n(\xi): [a_{1n}(\xi), a_{2n}(\xi)]$ or simply $J_n: [a_{1n}, a_{2n}]$ such that

(48)  $\rho_0(\xi_v^n) > \rho_1(\xi_v^n)$ [or $f_1(\xi_v^n) > 0$] for $a_{1n} < v < a_{2n}$

(49)  $\rho_0(\xi_v^n) < \rho_1(\xi_v^n)$ [or $f_1(\xi_v^n) < 0$] for $v \notin J_n$

(50)  $\rho_b(\xi_v^n) > \rho_a(\xi_v^n)$ [or $f_2(\xi_v^n) > 0$] for $v < a_{1n}$

(51)  $\rho_b(\xi_v^n) < \rho_a(\xi_v^n)$ [or $f_2(\xi_v^n) < 0$] for $v > a_{2n}$

and if $a_{1n} < a_{2n}$

(52)  $\rho_b(\xi_v^n) > \min[\rho_a(\xi_v^n), \rho_1(\xi_v^n)]$ for $v = a_{1n}$
(53)  $\rho_a(\xi_v^n) > \min[\rho_b(\xi_v^n), \rho_1(\xi_v^n)]$ for $v = a_{2n}$.


Equations (52) and (53) are superfluous for the absolutely continuous case since they are concerned with a set of $G_\theta$-measure zero for each $\theta \in \Omega$. In other words, the Bayes solution obtained by pointwise minimization of the average risk is to continue as long as $a_{1n} < v < a_{2n}$. Experimentation is stopped as soon as $v \notin J_n$. Then $H_1$ or $H_2$ is accepted according to which involves the smaller risk. To show for any $\xi \in \zeta$ the existence of $J_n(\xi)$ satisfying (48) through (53) for each $n$ will require several lemmas and theorems. Let

(54)  $f_0(\xi_v^n) = \rho_b(\xi_v^n) - \rho_0(\xi_v^n) = \max[f_2(\xi_v^n), 0]$

where the latter equality follows from (32) and (47). In what follows $n$ will denote an integer, $v$ a point in $S_v^n$ and $\theta$ a point in $\Omega$ even if this is not explicitly stated. Let

(55)  $p_v^n(\theta) = \frac{[\psi(\theta)]^n e^{\theta v}}{\int_\Omega [\psi(\theta)]^n e^{\theta v}\, d\xi(\theta)}$.

Then $p_v^n(\theta)$ is a probability law relative to the measure $\xi(\theta)$.

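LEMMA 2. Let $\xi(\theta)$ be any cdf on $\Omega$. Let $W(\theta)$ be a bounded, nonincreasing function of $\theta$ which is not constant on a set of $\xi$-measure unity. Then for all $n > 0$ and all pairs $v_1 < v_2$

(56)  $\int_\Omega W(\theta)\, d\xi_{v_1}^n(\theta) > \int_\Omega W(\theta)\, d\xi_{v_2}^n(\theta)$.

If the last condition on $W(\theta)$ is omitted then the weak inequality holds.

PROOF. (The proof of this lemma was kindly offered by Professor Erich L. Lehmann.) Since $v_2 > v_1$ the ratio $p_{v_2}^n(\theta)/p_{v_1}^n(\theta)$ is a strictly increasing function of $\theta$. Let $\Omega_+$ and $\Omega_-$ denote respectively the intervals on which $p_{v_2}^n(\theta) >$ and $\le p_{v_1}^n(\theta)$. Then

(57)  $\int_\Omega W(\theta)\, d[\xi_{v_2}^n(\theta) - \xi_{v_1}^n(\theta)] = \int_{\Omega_+} W(\theta)[p_{v_2}^n(\theta) - p_{v_1}^n(\theta)]\, d\xi(\theta) - \int_{\Omega_-} W(\theta)[p_{v_1}^n(\theta) - p_{v_2}^n(\theta)]\, d\xi(\theta)$.

Since $W(\theta)$ is nonincreasing there is a constant $c$ such that

(58)  $\int_{\Omega_+} W(\theta)[p_{v_2}^n(\theta) - p_{v_1}^n(\theta)]\, d\xi(\theta) \le c \int_{\Omega_+} [p_{v_2}^n(\theta) - p_{v_1}^n(\theta)]\, d\xi(\theta)$

(59)  $\int_{\Omega_-} W(\theta)[p_{v_1}^n(\theta) - p_{v_2}^n(\theta)]\, d\xi(\theta) \ge c \int_{\Omega_-} [p_{v_1}^n(\theta) - p_{v_2}^n(\theta)]\, d\xi(\theta)$.

If $W(\theta)$ is not constant almost everywhere ($\xi$) at least one of the above inequalities can be replaced by the strict inequality. Since $p_{v_1}^n(\theta)$ and $p_{v_2}^n(\theta)$ are both probability laws relative to $\xi$ then

(60)  $\int_{\Omega_-} [p_{v_1}^n(\theta) - p_{v_2}^n(\theta)]\, d\xi(\theta) = \int_{\Omega_+} [p_{v_2}^n(\theta) - p_{v_1}^n(\theta)]\, d\xi(\theta)$.

It follows from (57), by the use of (60) and the revised inequalities (58) and (59), that

(61)  $\int_\Omega W(\theta)\, d[\xi_{v_2}^n(\theta) - \xi_{v_1}^n(\theta)] < 0$

which proves the lemma.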

Clearly if $\xi \in \zeta$ then it is sufficient in the above lemma to assume that $W(\theta)$ is bounded, nonincreasing and not constant throughout $\Omega$.
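Lemma 2 can be checked numerically in a simple case. The sketch below assumes a uniform a priori distribution on a five-point grid, the unit-variance normal law with $\psi(\theta) = e^{-\theta^2/2}$, and a step loss that is nonincreasing in $\theta$; all of these choices and the function names are illustrative and not taken from the paper.

```python
import math


def posterior(prior, thetas, log_psi, n, v):
    # Weights proportional to prior * psi(theta)^n * exp(theta * v), as in (22).
    logw = [math.log(p) + n * log_psi(t) + t * v for p, t in zip(prior, thetas)]
    top = max(logw)
    w = [math.exp(lw - top) for lw in logw]
    total = sum(w)
    return [x / total for x in w]


def expected_loss(loss, prior, thetas, log_psi, n, v):
    # Integral of W(theta) with respect to the a posteriori distribution.
    w = posterior(prior, thetas, log_psi, n, v)
    return sum(wi * loss(t) for wi, t in zip(w, thetas))


def normal_log_psi(theta):
    return -0.5 * theta * theta


def step_loss(theta):
    # Bounded, nonincreasing, and not constant on the support of the prior.
    return 1.0 if theta < 0.0 else 0.0


thetas = [-1.0, -0.5, 0.0, 0.5, 1.0]
prior = [0.2] * 5
values = [
    expected_loss(step_loss, prior, thetas, normal_log_psi, 3, v)
    for v in (-2.0, -1.0, 0.0, 1.0, 2.0)
]
```

The entries of `values` decrease strictly as $v$ increases, in agreement with (56).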

COROLLARY 2. Let $f(y)$ be a bounded nonincreasing function of $y = x_{n+1}$ which is not constant on a set of $\mu$-measure unity. Then for all pairs $\theta_1 < \theta_2$

(62)  $\int f(y)\, dG_{\theta_1}(y) > \int f(y)\, dG_{\theta_2}(y)$.

PROOF. Since for $\theta_2 > \theta_1$ the ratio $g_{\theta_2}(y)/g_{\theta_1}(y)$ is a strictly increasing function of $y$ the proof is exactly as in Lemma 2.

COROLLARY 3. Let $f_1(y)$ and $f_2(y)$ be bounded functions of $y = x_{n+1}$ such that $f_1(y)$ is nonincreasing and

(63)  $f_1(y) \ge f_2(y)$ for all $y$.

Then for any $\xi$, all $n > 0$ and all pairs $v_1 < v_2$

(64)  $\int_\Omega \int f_1(y)\, dG_\theta(y)\, d\xi_{v_1}^n(\theta) \ge \int_\Omega \int f_2(y)\, dG_\theta(y)\, d\xi_{v_2}^n(\theta)$.

PROOF. Since $f_1(y)$ is nonincreasing then by Corollary 2 for any pair $\theta_1 < \theta_2$

(65)  $W_1(\theta_1) = \int f_1(y)\, dG_{\theta_1}(y) \ge \int f_1(y)\, dG_{\theta_2}(y) = W_1(\theta_2)$,

that is, $W_1(\theta)$ defined in (65) is a nonincreasing function of $\theta$. If we define $W_2(\theta)$ similarly with $f_1$ replaced by $f_2$ then by (63)

(66)  $W_1(\theta) \ge W_2(\theta)$ for all $\theta \in \Omega$.

Also since $f_i(y)$ ($i = 1, 2$) are bounded functions of $y$ then clearly $W_i(\theta)$ are bounded functions of $\theta$ ($i = 1, 2$). Hence for all pairs $v_1 < v_2$ by Lemma 2

(67)  $\int_\Omega W_1(\theta)\, d\xi_{v_1}^n(\theta) \ge \int_\Omega W_1(\theta)\, d\xi_{v_2}^n(\theta)$

and by (66)

(68)  $\int_\Omega W_1(\theta)\, d\xi_{v_2}^n(\theta) \ge \int_\Omega W_2(\theta)\, d\xi_{v_2}^n(\theta)$.

The result (64) follows from (67) and (68).

COROLLARY 4. If in addition to the hypothesis of Corollary 3 the inequality (63) is strict on a set $S^*$ of positive $\mu$-measure then the inequality (64) is a strict one.

PROOF. Since $g_\theta(y) > 0$ for all $y$ and all $\theta$ the added hypothesis implies that a strict inequality holds in (66) for all $\theta \in \Omega$. The strict inequality will then hold in (68) and hence also in the final result (64).
LEMMA 3. For any $\xi \in \zeta$, all $n > 0$ and all pairs $v_1 < v_2$

(69)  $f_2(\xi_{v_1}^n) > f_2(\xi_{v_2}^n)$,
(70)  $f_0(\xi_{v_1}^n) \ge f_0(\xi_{v_2}^n) \ge 0$,
(71)  $f_0(\xi_{v_1}^n) > f_0(\xi_{v_2}^n)$ if $f_0(\xi_{v_1}^n) > 0$.

In words, as $v$ increases

(i) the advantage of accepting $H_2$ over accepting $H_1$ increases,

(ii) the regret in accepting $H_2$ if we have to make a decision without further observations is nonincreasing. If there is any regret at all it must actually decrease.

PROOF. The functions $-W(\theta, 1)$ and $W(\theta, 2)$ satisfy the assumptions of Lemma 2 for any $\xi \in \zeta$. Replacing $W(\theta)$ by $W(\theta, 2)$ and $-W(\theta, 1)$ in (56) gives respectively

(72)  $\rho_b(\xi_{v_1}^n) > \rho_b(\xi_{v_2}^n)$,
(73)  $\rho_a(\xi_{v_1}^n) < \rho_a(\xi_{v_2}^n)$.

Subtracting (73) from (72) gives the desired result (69). With the aid of the last equality of (54) the results in (70) are immediate consequences of (69). To prove (71) note that if $f_0(\xi_{v_1}^n) > 0$ then by (54) and (69)

(74)  $f_0(\xi_{v_1}^n) = f_2(\xi_{v_1}^n) > f_2(\xi_{v_2}^n)$.

By (54) the value of $f_0(\xi_{v_2}^n)$ is either $f_2(\xi_{v_2}^n)$ or zero. In the former case (71) follows from (74) and in the latter case the result follows from the assumption that $f_0(\xi_{v_1}^n) > 0$. This proves Lemma 3. Let

(75)  $f_1^b(\xi_v^n) = \rho_b(\xi_v^n) - \rho_1(\xi_v^n)$,
(76)  $f_1^a(\xi_v^n) = \rho_a(\xi_v^n) - \rho_1(\xi_v^n)$.

It will be shown that $f_1^b$ is nonincreasing and that $f_1^a$ is nondecreasing in $v$ when $\xi \in \zeta$ and $n > 0$ are fixed and that these monotonicities are strict whenever the corresponding functions are nonnegative. It will also be shown that if we define

(77)  $f_1(\xi_v^n) = \min[f_1^a(\xi_v^n), f_1^b(\xi_v^n)] = \rho_0(\xi_v^n) - \rho_1(\xi_v^n)$

then there exist two points $a_{1n} \le a_{2n}$ such that

$f_1(\xi_v^n) \ge 0$ if and only if $a_{1n} \le v \le a_{2n}$

and if $a_{1n} < a_{2n}$ then

$f_1(\xi_v^n) > 0$ for $a_{1n} < v < a_{2n}$.

These are the points involved in equations (48) through (53). To show that $f_1^b(\xi_v^n)$ is a nonincreasing function of $v$ (Lemma 5) the following result is needed.

LEMMA 4. For any cdf $\xi$, all $n \ge 0$ and all $v$

(78)  $\rho_b(\xi_v^n) = \int_\Omega \int_{-\infty}^{\infty} \rho_b(\xi_{v+y}^{n+1})\, dG_\theta(y)\, d\xi_v^n(\theta)$

where $y = x_{n+1}$.

The proof is omitted. Intuitively the lemma says "if there is no cost of taking observations and $H_2$ is going to be accepted after the next observation then nothing is gained or lost by accepting $H_2$ now."

Lemma 5. For each $\xi \in \zeta$, all $n > 0$ and all pairs $v_1 < v_2$

(79) $f_1(\xi^n_{v_1}) \ge f_1(\xi^n_{v_2})$,
(80) $f_1(\xi^n_{v_1}) > f_1(\xi^n_{v_2})$ if $f_1(\xi^n_{v_1}) \ge 0$.

Proof. An induction on $n$ will be used. The theorem is first shown for $n \ge N =
N(\xi)$ defined in Theorem 2 for each $\xi \in \zeta$. Substituting (78) and (28) in (75) and
using the fact that (45) holds for $n \ge N$ gives

(81) $f_1(\xi^n_{v_1}) = \int_{\Omega} \int_{-\infty}^{\infty} f_0(\xi^{n+1}_{v_1+y}) \, dG_\theta(y) \, d\xi^n_{v_1}(\theta) - c_{n+1}$,

(82) $f_1(\xi^n_{v_2}) = \int_{\Omega} \int_{-\infty}^{\infty} f_0(\xi^{n+1}_{v_2+y}) \, dG_\theta(y) \, d\xi^n_{v_2}(\theta) - c_{n+1}$,

where $y = x_{n+1}$.

(i) Corollary 3 will now be used with $f_1(y)$ and $f_2(y)$ replaced by $f_0(\xi^{n+1}_{v_1+y})$
and $f_0(\xi^{n+1}_{v_2+y})$ respectively. The functions $f_0$ are clearly bounded by (54). By
(70) $f_0(\xi^{n+1}_{v+y})$ is a nonincreasing function of $y$ and (63) also follows from (70).
It therefore follows by Corollary 3 that the double integral in (81) is not less
than the double integral in (82). This proves (79) for $n \ge N$.

(ii) If $v_1$ is such that $f_1(\xi^n_{v_1}) \ge 0$ then by (81)

(83) $\int \left\{ \int_{\Omega} [f_0(\xi^{n+1}_{v_1+y}) - c_{n+1}] \, g_\theta(y) \, d\xi^n_{v_1}(\theta) \right\} d\mu(y) \ge 0$

where $g_\theta(y)$ is the integrand in (2). Since for each $y$ the function $g_\theta(y) > 0$ for
each $\theta$ the expression in brackets must also be positive for each $y$. Hence

(84) $f_0(\xi^{n+1}_{v_1+y}) \ge c_{n+1} > 0$

on a set $S^*$ of positive $\mu$-measure. Then by (71) since $v_1 < v_2$

(85) $f_0(\xi^{n+1}_{v_1+y}) > f_0(\xi^{n+1}_{v_2+y})$ for $y \in S^*$.

It follows from Corollary 4 that (80) holds for $n \ge N$.
If $N = 1$ the theorem is proved. Otherwise assuming the theorem holds for
$n = k+1, k+2, \ldots, N$ ($N \ge k+1 > 1$) it will be shown to hold for $n = k$. Let

(86) $f'(\xi^n_v) = \rho_2(\xi^n_v) - \rho(\xi^n_v) = \max[f_0(\xi^n_v), f_1(\xi^n_v)]$

where the last equality follows from (27). Substituting (78) and (28) in (75)
and using (86) gives for any $n$ (and in particular for $n = k$)

(87) $f_1(\xi^n_{v_1}) = \int_{\Omega} \int_{-\infty}^{\infty} f'(\xi^{n+1}_{v_1+y}) \, dG_\theta(y) \, d\xi^n_{v_1}(\theta) - c_{n+1}$,

(88) $f_1(\xi^n_{v_2}) = \int_{\Omega} \int_{-\infty}^{\infty} f'(\xi^{n+1}_{v_2+y}) \, dG_\theta(y) \, d\xi^n_{v_2}(\theta) - c_{n+1}$,

where $y = x_{n+1}$.

(i) Corollary 3 will again be used with $f_1(y)$ and $f_2(y)$ replaced by $f'(\xi^{k+1}_{v_1+y})$
and $f'(\xi^{k+1}_{v_2+y})$ respectively. The boundedness of these functions follows from
(86) and (27). It remains only to show that $f'$ is a nonincreasing function
of $y$ since (63) follows easily from this result. Let $y_1 < y_2$ denote two possible
values of $y$. By (70)

(89) $f_0(\xi^{k+1}_{v+y_1}) \ge f_0(\xi^{k+1}_{v+y_2})$,

and by the induction hypothesis on (79)

(90) $f_1(\xi^{k+1}_{v+y_1}) \ge f_1(\xi^{k+1}_{v+y_2})$.

Hence using the last equality of (86)

(91) $f'(\xi^{k+1}_{v+y_1}) \ge f'(\xi^{k+1}_{v+y_2})$.

It therefore follows from Corollary 3 that the right member of (87) is not less
than the right member of (88). This proves (79) for $n = k$ and hence for all $n > 0$.

(ii) If $v_1$ is such that $f_1(\xi^k_{v_1}) \ge 0$ then proceeding as in (83) and (84) we obtain
from (87) and (86) the result

(92) $\max[f_0(\xi^{k+1}_{v_1+y}), f_1(\xi^{k+1}_{v_1+y})] = f'(\xi^{k+1}_{v_1+y}) \ge c_{k+1} > 0$

on a set $S^*$ of positive $\mu$-measure. Consider any $y \in S^*$.

Case 1. If $f_0$ is the larger in (92) then by (71), (92) and (79)

(93) $f'(\xi^{k+1}_{v_1+y}) = f_0(\xi^{k+1}_{v_1+y}) > f_0(\xi^{k+1}_{v_2+y})$,
(94) $f'(\xi^{k+1}_{v_1+y}) > f_1(\xi^{k+1}_{v_1+y}) \ge f_1(\xi^{k+1}_{v_2+y})$.

Case 2. If $f_1$ is the larger in (92) then by (70), (92) and the induction
hypothesis for (80)

(95) $f'(\xi^{k+1}_{v_1+y}) = f_1(\xi^{k+1}_{v_1+y}) > f_1(\xi^{k+1}_{v_2+y})$,
(96) $f'(\xi^{k+1}_{v_1+y}) > f_0(\xi^{k+1}_{v_1+y}) \ge f_0(\xi^{k+1}_{v_2+y})$.

Case 3. If $f_0 = f_1$ in (92) then both (93) and (95) hold. Hence in each case
for all $y \in S^*$

(97) $f'(\xi^{k+1}_{v_1+y}) > \max[f_0(\xi^{k+1}_{v_2+y}), f_1(\xi^{k+1}_{v_2+y})] = f'(\xi^{k+1}_{v_2+y})$.
It follows from Corollary 4 that the right member of (87) is greater than the
right member of (88). This proves (80) for $n = k$ and hence for all $n > 0$.

If we define $f_2^*$, $f_0^*$, $f_1^*$ and $f^{*\prime}$ in exactly the same manner as $f_2$, $f_0$, $f_1$ and $f'$
respectively except that $\rho_1$ and $\rho_2$ are interchanged, then the following theorem
can be proved in a manner completely analogous to that above.

Lemma 6. For each $\xi \in \zeta$, all $n > 0$ and all pairs $v_1 < v_2$

(98) $f_1^*(\xi^n_{v_1}) \le f_1^*(\xi^n_{v_2})$,
(99) $f_1^*(\xi^n_{v_1}) < f_1^*(\xi^n_{v_2})$ if $f_1^*(\xi^n_{v_2}) \ge 0$.

Just to indicate why the inequalities are reversed the reader will note that the
analogue of (69) holds with reversed sign since $f_2^* = -f_2$.
THEOREM 3. For each $\xi \in \zeta$ and each $n > 0$ there exists a nonempty closed interval

$J_n(\xi): [a_{1n}(\xi), a_{2n}(\xi)]$

such that equations (48) through (53) hold.

Proof. Define $a_{2n} = a_{2n}(\xi)$ as the greatest lower bound of values of $v$ for which

(100) $\rho_2(\xi^n_v) \le \min[\rho_1(\xi^n_v), \rho_3(\xi^n_v)]$

if any such values exist and define it as $\infty$ otherwise. Define $a_{1n} = a_{1n}(\xi)$ as the
least upper bound of values of $v$ for which

(101) $\rho_1(\xi^n_v) \le \min[\rho_2(\xi^n_v), \rho_3(\xi^n_v)]$

if any such values exist and define it as $-\infty$ otherwise.

(i) To show that $a_{1n} \le a_{2n}$ we note that for any $v_0$ satisfying (100) by (69)

(102) $f_2(\xi^n_v) < f_2(\xi^n_{v_0}) \le 0$ for $v > v_0$.

Since this contradicts (101) it follows that the least upper bound of $v$ values
satisfying (101) is at most equal to the greatest lower bound of $v$ values
satisfying (100). Hence $a_{1n} \le a_{2n}$ and a nonempty closed interval $[a_{1n}, a_{2n}]$ is
therefore defined for each $n > 0$.

(ii) For any $v$ such that $a_{1n} < v < a_{2n}$ neither (100) nor (101) holds
and hence using (32)

(103) $\rho_3(\xi^n_v) < \min[\rho_1(\xi^n_v), \rho_2(\xi^n_v)] = \rho_0(\xi^n_v)$.

This proves (48).

(iii) For $v_0 > a_{2n}$ (if such values exist) by (100) and (79)

(104) $f_1(\xi^n_{v_0}) \le 0$.

If strict inequality holds we apply (79) and if equality holds we apply (80),
but the two results are the same, namely that

(105) $f_1(\xi^n_v) < 0$ for every $v > v_0$.

Since $v_0$ can be taken arbitrarily close to the greatest lower bound $a_{2n}$ then

(106) $f_1(\xi^n_v) < 0$ for every $v > a_{2n}$.

(The reason for introducing $v_0$ is to avoid the proof of continuity of $f_1$ in $v$.
Although the continuity holds it is not needed for our purposes.) Similarly by
(101) and Lemma 6 it follows that

(107) $f_1^*(\xi^n_v) < 0$ for $v < a_{1n}$.

Hence by (106) and (107)

(108) $\rho_0(\xi^n_v) = \min[\rho_1(\xi^n_v), \rho_2(\xi^n_v)] < \rho_3(\xi^n_v)$ for every $v \notin J_n$.

This proves (49).

(iv) By (100), (101) and (69)

(109) $f_2(\xi^n_v) < 0$ for $v > a_{2n}$,
(110) $f_2^*(\xi^n_v) < 0$ for $v < a_{1n}$.

This proves (50) and (51).

(v) By definition of $a_{1n}$, $a_{2n}$ it follows that

(111) $\rho_2(\xi^n_v) > \min[\rho_1(\xi^n_v), \rho_3(\xi^n_v)]$ for $v < a_{2n}$,
(112) $\rho_1(\xi^n_v) > \min[\rho_2(\xi^n_v), \rho_3(\xi^n_v)]$ for $v > a_{1n}$.

Hence if $a_{1n} < a_{2n}$ then (111) holds for $v = a_{1n}$ and (112) holds for $v = a_{2n}$.
This proves (52) and (53) and completes the proof of Theorem 3.
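
As an illustration only, the rule of action that Theorem 3 supports (stop as soon as the cumulative sum leaves the interval $J_n$, accepting $H_1$ below it and $H_2$ above it) can be sketched in Python. The function name `run_interval_rule`, the callable `bounds` and the numerical boundaries are hypothetical and are not taken from the paper. At a boundary point this sketch simply takes another observation, which is one of the choices the class A permits; randomization is not modelled.

```python
from typing import Callable, Iterable, Tuple


def run_interval_rule(
    observations: Iterable[float],
    bounds: Callable[[int], Tuple[float, float]],
) -> Tuple[str, int]:
    """Apply an interval rule to the running sum v_n = x_1 + ... + x_n.

    bounds(n) returns a hypothetical pair (a_1n, a_2n) with a_1n <= a_2n.
    """
    v = 0.0
    n = 0
    for x in observations:
        n += 1
        v += x
        a1, a2 = bounds(n)
        if a1 > a2:
            raise ValueError("need a_1n <= a_2n")
        if v < a1:
            return ("accept H1", n)
        if v > a2:
            return ("accept H2", n)
        # a1 <= v <= a2: take another observation
    return ("no decision", n)


def example_bounds(n: int) -> Tuple[float, float]:
    # Illustrative boundaries only; the theorem does not prescribe these.
    return (0.5 * n - 2.0, 0.5 * n + 2.0)


print(run_interval_rule([1, 1, 1, 1, 1], example_bounds))
```

With these illustrative boundaries the sum first exceeds the upper boundary at the fifth observation, so the sketch reports acceptance of $H_2$ at $n = 5$.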

As mentioned above, it follows in the discrete case that for each $\xi \in \zeta$ the class
$B_\xi \subset A$ and hence $B_\zeta \subset A$. Hence by the definition of closure $\bar{B}_\zeta \subset A$. In the
absolutely continuous case Theorem 3 shows that for $\xi \in \zeta$ there exists a Bayes
solution $\delta = \delta_\xi$ which belongs to A and that any other Bayes solution $\delta' = \delta'_\xi$
relative to the same $\xi$ can differ from $\delta$ at most on a set of n-dimensional
Lebesgue measure zero for each positive integer $n$. Hence by (11)

(113) $\delta_{0nj}(S_n) = \delta'_{0nj}(S_n)$

for each $n \ge 0$, each $j$ and every Lebesgue measurable set $S_n$ in the space of
$x_1, \ldots, x_n$. Hence, given any sequence $\{\delta^i\}$ ($i = 1, 2, \ldots$) of members of
$B_\zeta$ which converges in the regular sense (see Section 3), we can for each $i$
replace $\delta^i$ by a member of A without altering the convergence or the limit. Clearly
then, since any member of $\bar{B}_\zeta$ is a limit of members of $B_\zeta$ it is also a limit of
members of A and must therefore be in A, that is, $\bar{B}_\zeta \subset A$.

It now follows from Wald’s Theorem 3.19 (see Section 4 above) that A is 
essentially complete relative to D, in both the discrete and absolutely continu- 
ous cases. Then using the result of Section 6 the final result is obtained, namely 
that A is essentially complete relative to D,. 

COROLLARY 5. If in the above problem the indifference zone is the empty set and
$\theta(\cdot)$ takes on values between $\theta^*$ and $\theta^* + \epsilon$ for every $\epsilon > 0$ then the class A remains
essentially complete relative to $D_b$.

Proof. Consider the sequence of problems $P^i$ ($i = 1, 2, \ldots$) in which the
parameter space $\Omega$ remains fixed and the indifference zone $Z^i$ is the open interval
$(\theta^*, \theta_i)$ where $\theta^* < \theta_{i+1} < \theta_i < \theta$ for each $i$ ($i = 1, 2, \ldots$) and $\lim_{i \to \infty} \theta_i =
\theta^*$. Then the limiting problem $P^0$ has an empty indifference zone. In order that
Assumption 3(ii) hold for each problem $P^i$ it is sufficient that among the values
assumed by the function $\theta(\cdot)$ there is a strictly decreasing sequence approaching
$\theta^*$. By the above the class A is then essentially complete relative to $D_b$
for each problem $P^i$ ($i = 1, 2, \ldots$). Let $\delta$ be any decision rule for $P^0$. Then $\delta$ is
also a decision rule for each problem $P^i$. For each $i$ ($i = 1, 2, \ldots$) there exists
a decision rule $\delta_i \in A$ such that

(114) $r_i(\theta, \delta_i) \le r_i(\theta, \delta)$ for all $\theta \in \Omega$

where $r_i(\theta, \delta)$ is the risk function when $\delta$ is the decision rule used and $P^i$ is the
problem. Since $r_i(\theta, \delta)$ differs from $r_0(\theta, \delta)$ only on the open interval $(\theta^*, \theta_i)$ then
by (114)

(115) $r_0(\theta, \delta_i) \le r_0(\theta, \delta)$ for $\theta \in \Omega - Z^i$.

Wald has shown ([1], Theorem 3.2) that we can extract from the sequence
$\{\delta_i\}$ of members of A a subsequence $\{\delta_{i_\alpha}\}$ which converges in the regular sense
to a limit $\delta_0$ and which is such that

(116) $\liminf_{\alpha \to \infty} r_0(\theta, \delta_{i_\alpha}) \ge r_0(\theta, \delta_0)$ for all $\theta \in \Omega$.

By Theorem 1 it can be assumed that $\delta_0 \in A$ since otherwise we can find an
equivalent rule in A. Consider any point $\theta \in \Omega$. For sufficiently large $i$ the
point $\theta \in \Omega - Z^i$ so that (115) holds. Hence by (115) and (116)

(117) $r_0(\theta, \delta_0) \le r_0(\theta, \delta)$.
Since $\theta$ is arbitrary (117) holds for all $\theta \in \Omega$. This proves the corollary.

1. Summary. The stochastic processes which occur in the theory of queues
are in general not Markovian and special methods are required for their analysis. 
In many cases the problem can be greatly simplified by restricting attention to 
an imbedded Markov chain. In this paper some recent work on single-server 
queues is first reviewed from this standpoint, and the method is then applied 
to the analysis of the following many-server queuing-system: 

Input: the inter-arrival times are independently and identically distributed
in an arbitrary manner.

Queue-discipline: “first come, first served.”

Service-mechanism: a general number, s, of servers; negative-exponential
service-times.

If Q is the number of people waiting at an instant just preceding the 
arrival of a new customer, and if w is the waiting time of an arbitrary customer, 
then it will be shown that the equilibrium distribution of Q is a geometric series 
mixed with a concentration at Q = 0 and that the equilibrium distribution of 
w is a negative-exponential distribution mixed with a concentration at w = 0. 
(In the particular case of a single server this property of the waiting-time dis- 
tribution was first discovered by W. L. Smith.) 

The paper concludes with detailed formulae and numerical results for the 
following particular cases: 

Numbers of servers: s = 1, 2 and 3.

Types of input: (i) Poissonian and (ii) regular.


2. Introduction. This paper follows an earlier one [8] in which the reader will 
find a detailed account of the history of the subject, the technological applica- 
tions and the conventions of notation and terminology, but a thorough famili- 
arity with the contents of [8] will not be assumed. A queuing-system of the type 
to be considered is specified when we know (i) the input, (ii) the queue-discipline
and (iii) the service-mechanism. It will here be supposed that if the successive
“customers” demand service at the epochs $\ldots, t_r, t_{r+1}, \ldots$, and if $u_r$ denotes
the inter-arrival time $t_{r+1} - t_r$, then the random variables $\ldots, u_r, u_{r+1}, \ldots$ are
statistically independent and enjoy the same (arbitrary) distribution $dA(u)$
($0 \le u < \infty$). (Note that in some important applications this supposition of
independence will not be admissible; for example, it cannot be made when the
“customers” (which may be ships, or aircraft) are scheduled to arrive at specified
times and are late or early by independent random time-errors.) When no further
assumptions than these are made about the input I shall describe it as a general
independent input and in the label denoting the system this state of affairs will
be indicated by the letters GI. (Here I follow Lindley [3]; in [8] I called such an
input “regenerative,” but in the present paper I shall make no explicit use of
the concept of a set of regeneration points (for this, see [8] and the subsequent
discussion).) There are two important special types of input:

(1) D (deterministic, or regular):
    $A(u) = 0 \ (u < a); \quad A(u) = 1 \ (u \ge a)$,

(2) M (“random,” or Poissonian):
    $A(u) = 1 - e^{-u/a}$.

With a D-input the customers arrive at regular intervals of time (the
inter-arrival time being fixed and equal to a) while with an M-input the customers
arrive “at random” (i.e., in a Poisson process). We may also mention an intermediate
type of input:

(3) $E_k$ (Erlangian): $dA(u) = \dfrac{(k/a)^k}{\Gamma(k)} e^{-ku/a} u^{k-1} \, du$,

which coincides with M when k = 1 and which approaches D as k tends to
infinity. The idea of bridging the gap between D and M in this way is derived from
a similar device employed by Erlang (see [5]) in connection with the service-time
distribution. (For a more detailed account of the method in relation to a problem
in population mathematics see [7] and [9]. An extension due to Jensen and Palm
is discussed in [5].) If the original input is Poissonian and if it is filtered in such
a way that only every kth customer is admitted to the system then the net input
will be of type $E_k$ (the mean inter-arrival time being increased to ka); this remark
is of interest in connection with a certain cyclic rule of queue-discipline different
from the one to be considered here. In all three special cases it will be noted that
E(u) = a; I shall give a this meaning in the general case also and I shall suppose
throughout that $0 < a < \infty$.
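
The claims made for the Erlangian input can be checked numerically. The following Python sketch is not part of the paper; the helper names `erlang_density` and `moment`, the integration range and the step count are assumptions chosen for illustration. It evaluates the density in (3) and approximates its first two moments by a midpoint rule, so that one can see the mean staying at a while the variance (which for this law is $a^2/k$) shrinks as k grows, in line with the approach to the D-input.

```python
import math


def erlang_density(u: float, k: int, a: float) -> float:
    """Density of the E_k inter-arrival law with mean a, as in (3)."""
    return (k / a) ** k * u ** (k - 1) * math.exp(-k * u / a) / math.gamma(k)


def moment(k: int, a: float, power: int,
           upper: float = 60.0, steps: int = 60000) -> float:
    """Midpoint-rule approximation of E(u**power) on [0, upper]."""
    h = upper / steps
    return sum(
        ((i + 0.5) * h) ** power * erlang_density((i + 0.5) * h, k, a) * h
        for i in range(steps)
    )


a = 2.0
for k in (1, 3, 10):
    mean = moment(k, a, 1)
    variance = moment(k, a, 2) - mean ** 2
    print(k, round(mean, 4), round(variance, 4))
```

For k = 1 the density reduces to $(1/a) e^{-u/a}$, the M-input of (2).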

So much for the input. Under the heading of queue-discipline it will be sup- 
posed that the customers form up into a single queue in the usual way and that 
the customer at the head of the queue is served as soon as a server is free to at- 
tend to him. In general there will be s servers and s will not necessarily be equal 
to unity. 

The service-mechanism will be defined by the assertions that the service-times
$\ldots, v_r, v_{r+1}, \ldots$ of the successive customers are statistically independent of
one another and of the input (thus the presence of a long queue is here supposed
to have no effect on the speed of service), and that for all customers (irrespective
of the identity of the server) the service-time has the (arbitrary) distribution
$dB(v)$ ($0 \le v < \infty$). The distributions $dA(u)$ and $dB(v)$ will be classified in exactly
the same way, with the aid of the symbols G, D, M and $E_k$, and it will be
supposed throughout that $0 < E(v) = b < \infty$. The symbol G denotes that no special
assumption is made about $dB(v)$. With a D-distribution for the service-time each
customer is served for exactly the same length of time b. With an M-distribution
the service-times follow the negative-exponential law found to hold for
unrestricted telephone conversations. Once again the $E_k$ distribution is of an
intermediate form.

With these conventions a particular type of queuing-system can be identified 
by giving it a label such as D/G/3 (regular arrivals; no special assumption about 
the service-time distribution; three servers). Table 1 summarizes the principal 
contributions to the subject and shows where accounts of the various types of 
queuing-system are to be found. 


TABLE 1
Analysis of the literature on the theory of queues²

Author (date)          Systems discussed          References
Erlang (1908-29)       M/M/s, M/D/s, M/E_k/1      [5] (see also [11] and [13])
Pollaczek (1930)       M/G/1                      [14]
Khintchine (1932)      M/G/1                      [10]
Pollaczek (1934)       M/G/s                      [15]
Volberg (1939)         E_k/G/1                    [20]
Kendall (1951)         M/G/1                      [8]
Lindley (1952)         GI/G/1                     [12]
Pollaczek (1952)       GI/G/1                     [18]
Smith (1952)           GI/G/1                     [19]
Kendall (1952)         GI/M/s                     This paper.


Except in the case M/M/s the stochastic processes associated with the
fluctuations in queue-size are non-Markovian and special methods are required for their
analysis. In [8] I examined the system M/G/1 by considering the behavior of a
certain imbedded Markov chain and in this way obtained the distribution of
queue-size in statistical equilibrium (a result originally found by Pollaczek [14]
and Khintchine [10], each using quite different methods), and I discussed the
ergodic properties of the system in relation to the value of the relative traffic
intensity $\rho = b/a$. (“Relative,” because by a generally accepted convention it is
measured in relation to the capacity of the system. When calculated in this way
$\rho$ is said to be expressed in erlangs, the erlang being the international unit of
telephone traffic.)

2 See also footnote 3.

Recently Lindley [12] has formulated and in several important cases solved an
integral equation of the Wiener-Hopf type for the waiting-time distribution in
statistical equilibrium when the queuing-system is of the more general type
GI/G/1 (special attention being paid to the systems D/E_k/1 which Bailey and
Welch [1], [2] have shown to be of importance in the design of appointment
systems in hospital outpatient departments). (Lindley has given detailed solutions
for the systems M/G/1, E_k/G/1 and D/E_k/1. For an independent treatment of
the systems GI/G/1 see Pollaczek [18].) In continuation of Lindley’s work W. L.
Smith [19] has considered several other single-server systems, including those of
the type GI/M/1, in similar detail.

I shall show here that the work of Lindley and Smith can also be regarded as 
an application of the method of the imbedded Markov chain, and I shall then 
apply the “imbedding” method to analyze the properties of the many-server 
system GI/M/s. It was observed by Smith in his study of GI/M/1 that the
assumption of a negative-exponential service-time distribution leads to solutions of
a very simple form, whatever the (general independent) input; it will be seen 
here that the same is true even when we allow a general number of servers. 

In [8] I examined the ergodic behavior of the Markov chain imbedded in
M/G/1 with the aid of Feller’s theory of recurrent events; there are three quite
different types of behavior when $\rho < 1$, when $\rho = 1$ and when $\rho > 1$. For the
many-server system it has been observed by Pollaczek that the appropriate
definition of the relative traffic-intensity is $\rho = b/(sa)$. I shall assume here that
Pollaczek’s $\rho$ is less than unity; with this assumption it will be shown that a
stable equilibrium exists and the associated equilibrium distributions will be
determined. (Dr. F. G. Foster has considered the dependence of the qualitative
behavior of the system on the parameter $\rho$; his results are given elsewhere in this
issue (F. G. Foster, “On the stochastic matrices associated with certain queuing
processes,” Ann. Math. Stat., Vol. 24 (1953)).)


3. The imbedded Markov chain. Let the state of a stochastic system at time t
be denoted by X(t), so that (in any actual realization) the history of the system
can be represented as a function $X(\cdot)$ of the time with domain $(-\infty, \infty)$. (In
the applications which follow it will be an integer-valued step-function defined
to be continuous-to-the-right at its points of discontinuity.) Let $\Omega_t$ denote the
set whose elements are the functions having as domain the time-interval $(-\infty, t]$
and having the same range as $X(\cdot)$. For each t in $(-\infty, \infty)$ let $\Theta_t$ be a specified
subset of $\Omega_t$, and corresponding to any actual realization of the process let $\Pi$ be
the set of those values of t in $(-\infty, \infty)$ for which $\Theta_t$ contains as an element the
contraction of $X(\cdot)$ to the reduced domain $(-\infty, t]$. Let

$Y(t) = f_t\{X(\tau) : \tau \le t\}$ for $t \in \Pi$,

where $f_t$ is some specified functional with domain $\Theta_t$. Now suppose that
$\{\Theta_t, f_t : -\infty < t < \infty\}$ have been chosen in such a way that

(i) $\Pi$ almost certainly has no finite point of accumulation. (We then write its
members in increasing order as $\ldots, t_{n-1}, t_n, t_{n+1}, \ldots$.)

(ii) If $Y_m$ denotes $Y(t_m)$ for each $t_m \in \Pi$, then
distribution $\{Y_{n+1} \mid Y_n, Y_{n-1}, \ldots\}$ = distribution $\{Y_{n+1} \mid Y_n\}$ for all n.

The variables $\ldots, Y_{n-1}, Y_n, Y_{n+1}, \ldots$ will then be said to constitute an
imbedded Markov chain.

Such an imbedded chain can always be constructed, at least in a trivial way. 
Thus we could choose $\Theta_t = \Omega_t$ for each integer t, and require $\Theta_t$ to be void for all
other values of t. $\Pi$ would then be the set of integers, and by taking $f_t = 1$ we
would obtain an imbedded Markov chain. In practice, however, three conditions
must be satisfied if the procedure is to be of any value. First, the system must be 
simple enough to permit a mathematical formulation of the present heuristics. 
(The abstract formulation employed in this and the preceding paragraph must 
not mislead the reader into thinking that it would be a simple matter to imple- 
ment the program envisaged here in complete generality. Grave difficulties of 
definition would be encountered at the outset. The remarks in the present section 
of the paper are offered only as a guide to intuitive thinking.) Secondly, for Y to
be useful as a reduced state-description, the functional $f_t$ must be sufficiently
and suitably sensitive to variations in its argument. Thirdly, the stochastic
mechanism governing the transition from one instant in $\Pi$ to the next must be
simple enough to permit the calculation of the transition-probabilities
associated with distribution $\{Y_{n+1} \mid Y_n\}$.

The stochastic processes with which we shall be concerned all have a Markovian
origin and they are deprived of their Markovian character only because we are
unwilling to work with a sufficiently comprehensive description of the present 
state. One way of remedying this difficulty would be to augment the description 
of the present state so as to imbed the given process in a more complicated one 
having the Markov property. To illustrate this procedure we might consider 
taking the state Z(t) of the augmented process to be the whole past history of 
the given process: 


Z(t) = the contraction of $X(\cdot)$ to the reduced domain $(-\infty, t]$.

However, certain difficulties are to be expected in defining the Z-process
satisfactorily, and it is fortunate that such a drastic procedure is not necessary in
queuing theory, where in the worst case (GI/G/s) it would be enough to replace
the single initial state-variable (queue-size) by a vector variable of s + 2
components (the extra components specifying the expended service-times of the
people being served and the expended inter-arrival time). This particular form 
of the ‘augmentation’ technique can be carried through in some simple cases 
but only at the expense of very complicated calculations, and the method of 
contraction to an imbedded Markov chain is usually preferable even although 
by its very nature it must leave some of our questions unanswered. 

I shall illustrate the ‘contraction’ method by referring briefly to my earlier
treatment of M/G/1 and to Lindley’s treatment of GI/G/1. I shall then apply
it to obtain the equilibrium behavior of GI/M/s (a queuing process which seems
never to have been treated before except in some special cases).³

4. The system M/G/1. Let q be the number of people waiting or being served
at time t; then X(t) = q does not constitute a Markov process (except in the
special case M/M/1). The augmentation technique would not here be too
difficult; it would suffice to take $Z(t) = (q, v_0)$ where $v_0$ is the expended service-time
of the person being served ($v_0$ being left undefined when q = 0). The contraction
technique proceeds as follows (full details will be found in [8]; I quote here a few
illustrative results only). We define $X(\cdot)$ to be continuous-to-the-right at its
points of discontinuity, and we take the test for membership of the set $\Theta_t$ to be
X(t) = X(t - 0) - 1 (“the value of q has just decreased by unity”); thus the
set $\Pi$ consists of the epochs of departure. When $t \in \Pi$, let Y(t) = X(t) = q, so
that Y is the number of persons left behind by a departing customer (including the
person, if any, whose service is just starting). The fact that the input is of the
Poisson type (i.e., the M in M/G/1) ensures that the Y-chain is Markovian.

Let q' and q'' be the numbers of persons left behind by two consecutively
departing customers and let

$p_{ij} = \mathrm{pr}\{q'' = j \mid q' = i\}$.

Then $P = \|p_{ij}\|$ is the transition-matrix for the imbedded Markov chain, and
we can proceed with its analysis by the methods described in Feller’s book [6].
It is found that the matrix P is

$P = \begin{pmatrix} k_0 & k_1 & k_2 & k_3 & \cdots \\ k_0 & k_1 & k_2 & k_3 & \cdots \\ 0 & k_0 & k_1 & k_2 & \cdots \\ 0 & 0 & k_0 & k_1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}$

where

(4) $k_j = \dfrac{1}{j!} \int_0^\infty e^{-v/a} (v/a)^j \, dB(v)$,

and the chain is in all cases irreducible and aperiodic. It is ergodic when $\rho < 1$,
recurrent-but-null when $\rho = 1$ and transient when $\rho > 1$, and in the ergodic
case the limiting q-distribution is that generated by the function

(5) $\Pi(z) = (1 - \rho) \dfrac{(1 - z) K(z)}{K(z) - z}$,

where $K(z) = \sum_{j=0}^{\infty} k_j z^j$. From this the Pollaczek formula for the Laplace
transform of the waiting-time distribution follows easily on noting that q is the number
of arrivals during the sum of the departing customer’s waiting-time and his
service-time. In particular the probability of not having to wait is found to be
$1 - \rho$, and the ratio of the mean waiting-time to the mean service-time is
given by

(6) $\dfrac{E(w)}{E(v)} = \dfrac{\rho}{2(1 - \rho)} \left\{ 1 + \mathrm{var}\left(\dfrac{v}{b}\right) \right\}$,

a formula of great practical importance.

3 On my communicating a MS. copy of the present paper to M. Pollaczek, he informed
me that he has recently considered the system GI/M/s from a different point of view, a
number of his results being qualitatively equivalent with some of mine. His paper has
since appeared [Pollaczek, 1953] and contains an attack on the problem of the more general
system GI/G/s. It seems that manageable solutions may be expected whenever the Laplace
transform of the v-distribution is rational.
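
To make the mean-value relation concrete, here is a small Python sketch. It is not taken from the paper: the function names are hypothetical, and the relation is used in the form $E(w)/E(v) = \rho \{1 + \mathrm{var}(v/b)\} / \{2(1 - \rho)\}$. The second helper evaluates the constants $k_j$ of (4) under the assumption of a constant service-time b (the system M/D/1), in which case $k_j$ is the Poisson probability of j arrivals in time b.

```python
import math


def mean_wait_ratio(rho: float, var_v_over_b: float) -> float:
    """E(w)/E(v) from the mean-value formula, valid for 0 <= rho < 1."""
    if not 0.0 <= rho < 1.0:
        raise ValueError("the formula requires 0 <= rho < 1")
    return rho / (2.0 * (1.0 - rho)) * (1.0 + var_v_over_b)


def k_constant_service(j: int, rho: float) -> float:
    """k_j for M/D/1: Poisson arrivals counted over a fixed service-time b."""
    return math.exp(-rho) * rho ** j / math.factorial(j)


rho = 0.5
# var(v/b) = 1 for negative-exponential service, 0 for constant service.
print("M/M/1:", mean_wait_ratio(rho, 1.0))
print("M/D/1:", mean_wait_ratio(rho, 0.0))
# The k_j should sum to one and have mean rho.
print(sum(k_constant_service(j, rho) for j in range(40)))
print(sum(j * k_constant_service(j, rho) for j in range(40)))
```

At the same traffic intensity the constant-service system has half the mean wait of the negative-exponential one, which is the practical content of the variance term.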


5. The system GI/G/1. Let q and X(t) be defined as in Section 4 but now
let the test for membership of the set $\Theta_t$ be the requirement that either

X(t - 0) = 0 and X(t) = 1

or

$X(t - 0) \ge 2$ and X(t) = X(t - 0) - 1

(“the service of a customer has just commenced”); thus the set $\Pi$ will consist of
the epochs of commencement of service. If $t \in \Pi$ then we take Y(t) to be the
waiting-time w of the customer whose service has just commenced (that is, the time since his
arrival); of course Y(t) may be zero. The value of Y(t) can be found by examining
the graph of q against t, starting at some epoch when q was equal to zero; thus
Y(t) is a rather complicated functional of the contraction of $X(\cdot)$ to the domain
$(-\infty, t]$, and it is this functional which is $f_t$. A little intuitive consideration will
show that the Y-chain is Markovian, although (in contrast to the previous
example) it has a continuum of states. (If $w_r$ is given, then the prediction of $w_{r+1}$ is
equivalent to the prediction of $v_r - u_r$ (formula (7) below). Now information
about $w_{r-1}, w_{r-2}, \ldots$ would be of no assistance in making this prediction.) It
should be emphasized that the Y-chain would not be Markovian if the input
were not of the special type indicated by the symbol GI. The determination of
distr $\{Y_{n+1} \mid Y_n\}$ here depends on the fact that

(7) $w_{r+1} = \max\{w_r + v_r - u_r, 0\}$

where $w_r$ is the waiting-time of the customer whose service-time is $v_r$, and $u_r$ is
the time which elapses between the arrival of this and of the next customer.
The simple matrix relations of Section 4 are here replaced by an integral equation
of the Wiener-Hopf type due to D. V. Lindley [12] (see also Smith [19]).
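
The recurrence $w_{r+1} = \max\{w_r + v_r - u_r, 0\}$ lends itself to direct computation. The following Python sketch is an illustration only; the function name `waiting_times` and the numerical inputs are made up for the example and do not come from the paper.

```python
from typing import List, Sequence


def waiting_times(inter_arrivals: Sequence[float],
                  service_times: Sequence[float],
                  w0: float = 0.0) -> List[float]:
    """Successive waiting-times from w_{r+1} = max(w_r + v_r - u_r, 0).

    inter_arrivals[r] is u_r and service_times[r] is v_r; w0 is the
    waiting-time of the first customer considered.
    """
    w = w0
    out = [w]
    for u, v in zip(inter_arrivals, service_times):
        w = max(w + v - u, 0.0)
        out.append(w)
    return out


# Regular arrivals one time-unit apart; the first service is a long one.
print(waiting_times([1.0, 1.0, 1.0], [2.0, 0.5, 0.5]))
```

In this example the long first service delays the second customer by one unit, and the backlog is worked off over the next two arrivals. Only the current waiting-time is carried from step to step, which is the Markovian property noted in the text.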
6. The many-server system GI/M/s. Here q is the number of persons waiting
in the queue or being served at one of the s service-points at the time t, and
X(t) = q. The system is Markovian only in the special case M/M/s treated by
Erlang [5], Molina [13] and Kolmogorov [11]. Let the test for membership of the
set $\Theta_t$ be X(t) = X(t - 0) + 1 (“a customer has just arrived”); then the set $\Pi$
will consist of the epochs of arrival. If $t \in \Pi$ let Y(t) = X(t - 0), so that Y is the
number of persons found to be ahead of him (waiting, or being served) by the newly
arrived customer. As before, a little consideration will show that (because of the
negative-exponential service-times) the Y-chain is Markovian. It will be
convenient to use q for the value of Y but it should be borne in mind that it is not
q but Q = max (q - s, 0) which is the length of the queue in the ordinary sense
of the word; if $0 \le q \le s$ then q of the service-points will be occupied and s - q
will be free and no one will be waiting.

We commence with an examination of the general form of the matrix P whose
(i, j)th element is $p_{ij} = \mathrm{pr}\{q'' = j \mid q' = i\}$, where q' and q'' refer to two
consecutive epochs of arrival. The best way to describe the matrix P is first to
partition it as follows:

(8) $P = \left( \begin{array}{c|c} \begin{array}{c} A \\ \hline B \end{array} & C \end{array} \right)$

Here A is a square matrix of s rows and s columns; in describing the elements
of A, B and C they will always be given the labels which they bear in virtue of
position in P; thus the top left-hand element of C will be called $c_{0s}$, and not $c_{00}$.

Consider the (i, j)th element of the matrix A. We must have $i \le s - 1$, and
so the newly arriving customer will find s - 1 or fewer customers ahead of him,
all of whom are being served. There will therefore be at least one server free and
so the service-time of the new customer can commence immediately. We have
to account for the events during the period of time u which elapses between his
own arrival and the arrival of the next customer. Suppose that n customers
conclude their service during this period, and let us write [n | m; u] for the conditional
probability associated with the stated value of n when m (= i + 1) customers
are being served at the commencement of the period. We can think of these m
customers as the members of a colony which is subject to a randomly-operating
death-rate of amount 1/b per head per unit of time. The theory of the simple
death-process (see, for example, [6]) then gives

(9) $[n \mid m; u] = \binom{m}{n} (1 - e^{-u/b})^n e^{-(m-n)u/b}$

(a result which can also be obtained directly without much difficulty). If we put

(10) $[n \mid m] = \int_0^\infty [n \mid m; u] \, dA(u)$

then we shall have

$p_{ij} = [i + 1 - j \mid i + 1]$ when $j \le i + 1$

and

$p_{ij} = 0$ elsewhere,

provided that $(i, j) \in A$. Here is the form of A when s = 4:

[1|1]  [0|1]  0      0
[2|2]  [1|2]  [0|2]  0
[3|3]  [2|3]  [1|3]  [0|3]
[4|4]  [3|4]  [2|4]  [1|4]

It is important to notice that the elements of A are positive when $j \le i + 1$.
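
The block A can be tabulated once a particular input is chosen. The Python sketch below is not from the paper: it assumes a regular (D) input, so that the integral over dA(u) reduces to evaluation at u = a, and it takes the death-process probability in the binomial form $\binom{m}{n}(1 - e^{-u/b})^n e^{-(m-n)u/b}$. The names `death_prob` and `block_a` are hypothetical.

```python
import math
from typing import List


def death_prob(n: int, m: int, u: float, b: float) -> float:
    """[n | m; u]: n of m services in progress end within time u."""
    q = math.exp(-u / b)  # chance that one given service is still running
    return math.comb(m, n) * (1.0 - q) ** n * q ** (m - n)


def block_a(s: int, a: float, b: float) -> List[List[float]]:
    """The s-by-s block A for a regular input with inter-arrival time a."""
    rows = []
    for i in range(s):
        m = i + 1  # customers in service just after the arrival
        rows.append([
            death_prob(i + 1 - j, m, a, b) if j <= i + 1 else 0.0
            for j in range(s)
        ])
    return rows


for row in block_a(4, 1.0, 2.0):
    print([round(x, 4) for x in row])
```

Rows 0 to s - 2 of A add up to one, because every admissible j lies inside A. The last row falls short of one by the probability that none of the s services ends, and that mass sits in the adjoining column of P outside A.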

The elements of B are less simple in form. Here $i \ge s$, and so immediately
after the commencement of the inter-arrival time u there will be m = i - s + 1
persons waiting in addition to the s persons who are being served. Also j < s,
and so at the end of the interval no one can be waiting and n = s - j servers
must be free. We require the conditional probability {n | s; m; u} associated
with the stated value of n when m and u are given. Suppose that the last of the
m waiting customers is received at a service-point after the lapse of a time U.
The distribution of U can be written down at once because $sU/b = \tfrac{1}{2}\chi^2_{2m}$, and so

(11) $\{n \mid s; m; u\} = \int_0^u \dfrac{s}{b} \, e^{-sU/b} \, \dfrac{(sU/b)^{m-1}}{(m-1)!} \, [n \mid s; u - U] \, dU$.

If now we put

(12) $\{n \mid s; m\} = \int_0^\infty \{n \mid s; m; u\} \, dA(u)$

then we shall have

$p_{ij} = \{s - j \mid s; i - s + 1\}$ when $(i, j) \in B$;

it is to be noted that all these probabilities will be positive.
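
A numerical check of the B elements is possible under simplifying assumptions. The Python sketch below is illustrative and not from the paper. It assumes a regular input (u = a), evaluates the integral over U by a midpoint rule, and takes the density of U to be that of a sum of m negative-exponential gaps of rate s/b. For the remaining entries of the same row (those with $j \ge s$) it uses the Poisson probabilities with rate s/b that the next paragraph derives for C. If these assumed forms are right, a full row of P should add up to one. All function names are hypothetical.

```python
import math


def death_prob(n: int, m: int, u: float, b: float) -> float:
    """[n | m; u]: n of m services in progress end within time u."""
    q = math.exp(-u / b)
    return math.comb(m, n) * (1.0 - q) ** n * q ** (m - n)


def b_prob(n: int, s: int, m: int, u: float, b: float,
           steps: int = 20000) -> float:
    """{n | s; m; u} by the midpoint rule applied to the integral over U."""
    h = u / steps
    rate = s / b
    total = 0.0
    for i in range(steps):
        big_u = (i + 0.5) * h
        # density of U: the m-th event of a Poisson process of rate s/b
        dens = (rate * (rate * big_u) ** (m - 1) * math.exp(-rate * big_u)
                / math.factorial(m - 1))
        total += dens * death_prob(n, s, u - big_u, b) * h
    return total


def poisson_prob(n: int, s: int, u: float, b: float) -> float:
    """Probability of n incidents in time u at rate s/b."""
    lam = s * u / b
    return math.exp(-lam) * lam ** n / math.factorial(n)


s, a, b, i = 2, 1.0, 1.5, 3
m = i - s + 1
row_b = [b_prob(s - j, s, m, a, b) for j in range(s)]            # j < s
row_c = [poisson_prob(i + 1 - j, s, a, b) for j in range(s, i + 2)]  # j >= s
print(sum(row_b) + sum(row_c))
```

The printed total should be very close to one, up to the quadrature error of the midpoint rule.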

Finally we require the elements of C, and here $j \ge s$. Obviously we shall have
$p_{ij} = 0$ if j > i + 1, because no one can enter the system during the inter-arrival
interval. Let us write n = i + 1 − j for the number of customers whose service
is concluded during the inter-arrival time, and then seek the conditional
probability (n | s; u) associated with the stated value of n when u is given. (The
notation implies that this probability is independent of the value of i (which is
also to be supposed given). That this is so will be the principal result of the
argument which follows.) Because $j \ge s$ for the elements of C we need only consider
the values of n such that $0 \le n \le i + 1 - s$, and if n has one of these values
then there will be no service-point unoccupied during or at the end of the inter-arrival
time u. Thus (n | s; u) is equal to the probability that, in time u, n incidents will
be registered in a Poisson process for which the expected incident-rate is s/b per
unit of time. That is,

(13) $(\,n \mid s; u\,) = e^{-su/b}\, \frac{(su/b)^{n}}{n!},$

and so if we write

(14) $(\,n \mid s\,) = \int_0^\infty (\,n \mid s; u\,)\, dA(u)$


then we shall have

$p_{ij} = (\,i + 1 - j \mid s\,)$ when $j \le i + 1$

and
$p_{ij} = 0$ elsewhere,
provided that $(i, j) \in C$. Here is the form of C when s = 4:
0     0     0
0     0     0
0     0     0
0     0     0
(0|4) 0     0
(1|4) (0|4) 0
(2|4) (1|4) (0|4)


It is to be noted that the diagonal and super-diagonal elements of P which lie 
in C are respectively equal to (1 | s) and to (0 | s) and that both these quantities 
are positive. 


7. The many-server system GI/M/s (continued). Now that we know the form
of the matrix P we can apply Feller's treatment of denumerable Markov chains
to obtain the principal properties of the imbedded Markov chain in the
practically important case when the relative traffic-intensity, ρ = b/(sa), is less
than unity. Familiarity with chapter 15 of Feller's book [6] will be assumed.

In the first place, the chain is irreducible. This can either be seen analytically, 
or made self-evident by the following intuitive considerations. 

(a) The transition i → 0 always has a positive probability because it can happen
that all the i + 1 customers will be served and will leave the system during the
inter-arrival period.

(b) The transition i → i + 1 always has a positive probability because it can
happen that no customer will leave the system during the inter-arrival period.
Thus it is possible in a suitable (finite) number of steps for the system to move
from any given state to the zero state and thence to any other given state.

Accordingly, every state is of the same "type." That they are all aperiodic
follows from Feller's theory and the fact that the diagonal elements of P are
positive. We shall now show that with the given restriction on the relative
traffic-intensity (ρ < 1), the states are ergodic. This will imply that the probability
$p_{ij}^{(n)}$ (that the system will be in state j after n steps) converges in the ordinary sense
(as n tends to infinity) towards the jth element, $\pi_j$, of a limiting distribution
which is independent of the initial state, i.

From Feller's theorems 1 and 2 we know that either (i) every state is transient
(or every state is recurrent-but-null) and $p_{ij}^{(n)} \to 0$ as $n \to \infty$ for all i and j; or
(ii) every state is ergodic and $p_{ij}^{(n)} \to \pi_j$ as $n \to \infty$ for all i and j, where the π's
are positive and sum to unity. Suppose that the matrix P could be shown to
possess a nonnull invariant row-vector x, the components of which form the
terms in an absolutely convergent series. Then $x = xP = xP^n$, and so in case (i)
we should have

$x_j = \sum_{\alpha} x_\alpha\, p_{\alpha j}^{(n)} \to 0$

and in case (ii) we should have

$x_j = \sum_{\alpha} x_\alpha\, p_{\alpha j}^{(n)} \to \bigl(\sum_\alpha x_\alpha\bigr)\pi_j$ as $n \to \infty$,

and because x is supposed not to be null it would then follow that the system
must be ergodic, that $\sum x_\alpha \ne 0$ and that the vector x could be normalized to
give the limiting distribution, π. Thus, in order to establish ergodicity and
determine the limiting distribution it will be sufficient to construct a vector x having
the stated properties. (It is important to note that we do not need to worry
about the signs of the components of x.)

In order to implement this program, let us write

(15) $x = [\mu_0, \mu_1, \cdots, \mu_{s-2}, 1, \lambda, \lambda^2, \cdots]$

(the μ-terms being absent when s = 1). I shall show that λ and the μ's can be
chosen so as to give this vector the required properties (it will be enough if it is
invariant and if |λ| < 1). The invariance of x requires that

$x_j = \sum_{\alpha=0}^{\infty} x_\alpha\, p_{\alpha j}$  $(0 \le j < \infty)$

and when $j \ge s$ each of these equations will be found to be equivalent to the
following equation for λ:

(16) $F(\lambda) = \lambda$  $(0 < \lambda < 1)$

where

(17) $F(\lambda) = \int_0^\infty e^{-(1-\lambda)su/b}\, dA(u).$

There is a momentary advantage in writing the λ-equation in the form
$\sum_{n=0}^{\infty} (n \mid s)\lambda^n = \lambda$, and then noting that the coefficients (n | s) are all positive and are
the terms in a probability distribution whose mean is equal to 1/ρ (and so is
greater than unity). It is now an immediate consequence of the fundamental
lemma of branching-process theory (see, e.g., [6]) that the λ-equation has a
unique root in the interval 0 < λ < 1. In what follows the symbol λ will always
denote this root.

The equations $x_j = \sum x_\alpha p_{\alpha j}$ $(1 \le j \le s - 1)$ can now be used for the successive
determination of the μ's (this depends on the fact that [0 | 1], [0 | 2], ..., [0 | s − 1]
are all positive). Thus we have only to verify that the vector x so constructed
satisfies the last of the invariance conditions, $x_0 = \sum x_\alpha p_{\alpha 0}$. But this is an
immediate consequence of the fact that the row-sums of the matrix P are all equal
to unity (the intuitive meaning of the row-sum condition is obvious; its analytical
verification is tedious but elementary). The following statement summarizes
the results so far obtained:

I: when the relative traffic-intensity is less than unity, the Markov chain imbedded
in GI/M/s is irreducible and ergodic. The limiting distribution is a geometric series
save for modifications to its first (s − 1) terms, the common ratio being the unique
root of the equation

$F(\lambda) = \lambda$  $(0 < \lambda < 1)$.
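The geometric form of the limiting distribution can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions, not the paper's method: it takes a single server with regular input (D/M/1), truncates the imbedded chain to a finite number of states (folding the lost upward step into the last state), and finds the limiting vector by repeated multiplication. The function names, the truncation size and the step count are all choices made here.

```python
from math import exp, factorial

def dm1_chain(rho, size):
    """Truncated imbedded chain for D/M/1 on the states 0..size-1."""
    mean = 1.0 / rho  # expected number of service completions per interval
    pois = [exp(-mean) * mean ** n / factorial(n) for n in range(size + 1)]
    P = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(1, size):
            if j <= i + 1:
                P[i][j] = pois[i + 1 - j]      # element (i + 1 - j | 1) of block C
        if i == size - 1:
            P[i][size - 1] += pois[0]          # truncation: no room to move up
        P[i][0] = max(0.0, 1.0 - sum(P[i][1:]))  # remaining mass: system empties
    return P

def limiting_vector(P, steps=500):
    """Approximate the invariant row-vector by iterating x -> xP."""
    size = len(P)
    x = [1.0 / size] * size
    for _ in range(steps):
        x = [sum(x[i] * P[i][j] for i in range(size)) for j in range(size)]
    total = sum(x)
    return [v / total for v in x]

pi = limiting_vector(dm1_chain(rho=0.5, size=40))
ratio_01 = pi[1] / pi[0]
ratio_12 = pi[2] / pi[1]
```

With ρ = 0.5 both ratios should agree with each other and with the root of λ = exp(−(1 − λ)/ρ), which is close to 0.2032.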


Now i, the number of persons waiting or being served in the system, is not in
practice so interesting as the "true" queue-size encountered by the
newly-arriving customer:

(18) Q = max (i − s, 0).

A random variable of equal importance is the waiting-time of the new customer,
w. If k = max (i − s + 1, 0) then w will be the sum of k independent variables
each distributed like $\tfrac{1}{2} b \chi_2^2 / s$, and so the Q- and w-distributions are quite easy to
find once the i-distribution is known. Thus, in the statistical equilibrium which is
ultimately attained when the relative traffic-intensity ρ is less than unity, we
shall have:

II: the probability that Q is zero is

(19) $\alpha = \dfrac{\sum\mu + 1 + \lambda}{\sum\mu + 1/(1 - \lambda)},$

and the probability that w is zero is

(20) $\beta = \dfrac{\sum\mu + 1}{\sum\mu + 1/(1 - \lambda)};$

III: the mean value of Q is given by

(21) $E(Q) = \dfrac{1 - \alpha}{1 - \lambda},$

and the mean value of w is given by

(22) $\dfrac{E(w)}{E(v)} = \dfrac{1 - \beta}{s(1 - \lambda)};$

IV: when Q is known to be positive, its conditional distribution is

(23) $(1 - \lambda)\lambda^{Q-1}$  (Q = 1, 2, 3, ...),

and when w is known to be positive, its conditional distribution is

(24) $e^{-w/c}\, dw/c$  $(0 < w < \infty)$,

where

(25) $c = \dfrac{b}{s(1 - \lambda)}.$
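The equilibrium quantities follow mechanically from λ and the μ's once the limiting vector is taken as (μ_0, ..., μ_{s−2}, 1, λ, λ², ...). The helper below is a sketch under that assumption; its name and return layout are invented for illustration. It returns the probability that Q is zero, the probability that w is zero, the mean of Q, and the mean of w in units of the mean service-time.

```python
def equilibrium_summary(lam, mus, s):
    """Equilibrium summaries for GI/M/s from the root lam and the mu-terms."""
    total_mu = sum(mus)
    norm = total_mu + 1.0 / (1.0 - lam)         # sum of all components of x
    alpha = (total_mu + 1.0 + lam) / norm       # states i <= s give Q = 0
    beta = (total_mu + 1.0) / norm              # states i <= s - 1 give w = 0
    mean_q = (1.0 - alpha) / (1.0 - lam)
    mean_w_ratio = (1.0 - beta) / (s * (1.0 - lam))
    return alpha, beta, mean_q, mean_w_ratio

# M/M/1 with rho = 0.5: lam = rho and there are no mu-terms.
mm1 = equilibrium_summary(0.5, [], 1)
# M/M/2 with rho = 0.5: lam = rho and mu_0 = 1/(2 rho) = 1.
mm2 = equilibrium_summary(0.5, [1.0], 2)
```

For M/M/1 this reproduces the familiar values 1 − ρ for the chance of not waiting and ρ/(1 − ρ) for the ratio of mean waiting-time to mean service-time.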

The second part of Theorem IV is a generalization of a remarkable result 
recently discovered by W. L. Smith [19]. He observed that for a single server and 
with certain restrictions on the form of the distribution dA (u) defining the general 
independent input, a negative-exponential distribution of service-times produces 
a negative-exponential distribution of waiting-time (apart from a probability- 
concentration at the origin). We now see that the restrictions on the form of the 
distribution dA (u) are unnecessary and that the result is true whatever the num- 
ber of servers. Moreover, this simple property of the waiting-time distribution 
is associated with an equally simple property of the queue-size distribution, 
which we have shown to be of the geometric-series form apart from a probability- 
concentration at Q = 0. 


8. Detailed results for GI/M/s when s ≤ 3. Suppose first that the
inter-arrival time u has a distribution $dA(u) = dA_1(u/a)$, where $dA_1(u)$ is a fixed
distribution with a mean equal to unity, so that the average inter-arrival time,
a, enters dA(u) only as a scale-parameter. Then the (λ, ρ)-relation can be written

(26) $\lambda = \int_0^\infty e^{-(1-\lambda)x/\rho}\, dA_1(x)$  $(0 < \lambda < 1)$,

and so it is independent of a and of the number of servers, s. Two special cases
are of interest.

Poissonian input (system M/M/s). Here $dA_1(u) = e^{-u}\,du$, and the (λ, ρ)-
equation is

(27) $\lambda^2 - (1 + \rho)\lambda + \rho = 0.$

The root in the interval (0, 1) is λ = ρ. (The results for the system M/M/s are
well known and are to be found in references [5], [13], [11] and [6].)

Regular input (system D/M/s). (This system has only been studied before in
the case s = 1.) Here the (λ, ρ)-equation can most conveniently be put in the
form

(28) $\lambda = e^{-X}$, where X = (1 − λ)/ρ, and 0 < λ < 1.

A few corresponding values of \ and p are given in Table 2. 
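The (λ, ρ)-relation is easy to solve numerically. The following sketch assumes the relation is written as λ = A_1*((1 − λ)/ρ), where A_1* is the Laplace-Stieltjes transform of the unit-mean inter-arrival distribution, and iterates upward from λ = 0, which converges to the root inside (0, 1) because the right-hand side is increasing and convex in λ. The function names are hypothetical.

```python
from math import exp

def solve_lambda(transform, rho, tol=1e-13, max_iter=100000):
    """Root in (0, 1) of lam = transform((1 - lam) / rho), for 0 < rho < 1."""
    if not 0.0 < rho < 1.0:
        raise ValueError("the relative traffic-intensity must lie in (0, 1)")
    lam = 0.0
    for _ in range(max_iter):
        nxt = transform((1.0 - lam) / rho)
        if abs(nxt - lam) < tol:
            return nxt
        lam = nxt
    return lam

def poissonian(theta):
    """Transform of the unit-mean negative-exponential distribution."""
    return 1.0 / (1.0 + theta)

def regular(theta):
    """Transform of a unit point-mass (regular arrivals)."""
    return exp(-theta)

lam_m = solve_lambda(poissonian, 0.3)   # should reproduce lam = rho
lam_d = solve_lambda(regular, 0.5)      # regular input at rho = 0.5
```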





TABLE 2
The (λ, ρ)-relation for the system D/M/s

  ρ      λ              ρ      λ
  0.0    0              0.6    0.3242
  0.1    0.0000454      0.7    0.4670
  0.2    0.00698        0.8    0.6286
  0.3    0.04088        0.9    0.8069
  0.4    0.10736        1.0    1
  0.5    0.2032

In order to complete the study of any particular case when s ≥ 2 we need
the values of the μ's. When s = 2 or 3 the procedure is as follows; it will be
obvious how this is to be extended when s ≥ 4.

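Determination of $\mu_0$ when s = 2. To simplify the formulae, I shall suppose that
a = 1; this will not result in any loss of generality. The $\mu_0$-equation is

(29) $1 = [0 \mid 1]\,\mu_0 + [1 \mid 2] + \sum_{\alpha=1}^{\infty} \{1 \mid 2; \alpha\}\,\lambda^{\alpha},$

and after a few transformations this becomes

(30) $\Bigl(\dfrac{2}{1 - 2\lambda} + \mu_0\Bigr) \int_0^\infty e^{-u/2\rho}\, dA(u) = \dfrac{1}{1 - 2\lambda}.$

Thus we have:

(31) System M/M/2: $\mu_0 = 1/(2\rho)$.

(32) System D/M/2: $\mu_0 = \dfrac{2 - e^{1/(2\rho)}}{2\lambda - 1}.$

Determination of $\mu_0$ and $\mu_1$ when s = 3. The equations determining the μ's are

(33) $1 = [0 \mid 2]\,\mu_1 + [1 \mid 3] + \sum_{\alpha=1}^{\infty} \{1 \mid 3; \alpha\}\,\lambda^{\alpha},$

and

(34) $\mu_1 = [0 \mid 1]\,\mu_0 + [1 \mid 2]\,\mu_1 + [2 \mid 3] + \sum_{\alpha=1}^{\infty} \{2 \mid 3; \alpha\}\,\lambda^{\alpha},$

and after some transformations these become

(35) $\Bigl(\dfrac{3}{1 - 3\lambda} + \mu_1\Bigr) \int_0^\infty e^{-2u/3\rho}\, dA(u) = \dfrac{1}{1 - 3\lambda}$

and

(36) $\Bigl(\dfrac{6}{2 - 3\lambda} + 2\mu_1 + \mu_0\Bigr) \int_0^\infty e^{-u/3\rho}\, dA(u) = \dfrac{4}{2 - 3\lambda} + \mu_1.$

It is now quite a simple matter to find $\mu_0$ and $\mu_1$ when dA(u) is given. For example:

(37) System M/M/3: $\mu_1 = \dfrac{2}{3\rho}$, and $\mu_0 = \dfrac{2}{(3\rho)^2}.$

System D/M/3: if x is the positive root of $x^{3\rho} = e$,
then

$\mu_1 = \dfrac{3 - x^2}{3\lambda - 1}$

and

(38) $\mu_0 = 2\Bigl(\dfrac{3 - 2x}{3\lambda - 2}\Bigr) - (2 - x)\mu_1.$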
The above formulae have been used in the construction of Table 3, which shows 
the effect of varying the quality and intensity of the input and the number of 
servers when the service-time has a negative-exponential distribution. For a 
specified relative traffic-intensity, p, Table 3 gives 

(a) the probability of not having to wait, and 

(b) the ratio of the mean waiting-time to the mean service-time. "Random"
arrivals and "Regular" arrivals refer to the systems M/M/s and D/M/s, respectively.


TABLE 3
The many-server queuing-system with a negative-exponential service-time

[Columns: relative traffic-intensity ρ; then, for one, two and three servers, the
entries (a) and (b) under random arrivals and under regular arrivals. The tabulated
values are not recoverable from this copy.]

ON THE STOCHASTIC MATRICES ASSOCIATED WITH
CERTAIN QUEUING PROCESSES 


By F. G. Foster 
London School of Economics 


1. Summary and introduction. We shall be concerned with an irreducible 
Markov chain, which we shall call “the system.” For simplicity we shall assume 
that the system is aperiodic, but this is not essential. The reader is referred to 
[1] for explanations of the terminology used. We first state some general theorems 
which provide criteria for determining whether the system is transient,
recurrent-null or ergodic (recurrent-nonnull). These are then applied to the Markov chains
associated with certain queuing processes recently studied by D. G. Kendall 
[4], [5]; most of the results have already been obtained by Kendall using direct 
methods, and the main purpose of the present paper is to illustrate the applica- 
tion of general theorems to this type of problem. 


2. Let $[p_{ij}]$ (i, j = 0, 1, 2, ...) be the infinite stochastic matrix of the system,
and denote by $[p_{ij}^{(n)}]$ its nth power.

THEOREM 1. The system is ergodic if there exists a nonnull solution of the
equations

(1) $\sum_{i=0}^{\infty} x_i p_{ij} = x_j$  (j = 0, 1, 2, ...)

such that $\sum |x_i| < \infty$; and only if this property is possessed by any nonnegative
solution of the inequalities

(2) $\sum_{i=0}^{\infty} x_i p_{ij} \le x_j$  (j = 0, 1, 2, ...).

PROOF. It is known (cf. [1]) that $\lim_{n\to\infty} p_{ij}^{(n)} = \pi_j$ always exists and is
independent of i; and further that either $\pi_j > 0$ for all j or $\pi_j = 0$. The system is
ergodic if and only if $\pi_j > 0$. For any nonnull absolutely convergent solution
$\{x_j\}$ of (1)

(3) $\sum_{i=0}^{\infty} x_i p_{ij}^{(n)} = x_j$  (j = 0, 1, 2, ...)

for all n, and so

(4) $\pi_j \sum_{i=0}^{\infty} x_i = x_j.$

Therefore $\pi_j > 0$ (for otherwise the solution would be null), and so the system
is ergodic.

Conversely, suppose the system to be ergodic, so that $\pi_j > 0$. Let $\{x_j\}$ be
any nonnegative solution of (2). Then we have also (3) and (4) with inequalities.
Therefore $\sum x_i < \infty$.

THEOREM 2. The system is ergodic if there exists a nonnegative solution of the
inequalities

(5) $\sum_{j=0}^{\infty} p_{ij} y_j \le y_i - 1$,  $i \ne 0$,

such that $\sum_{j=0}^{\infty} p_{0j} y_j < \infty$.

PROOF. Let $\sum_{j=0}^{\infty} p_{0j} y_j = \lambda$. Define

$y_i^{(n+1)} = \sum_{j=0}^{\infty} p_{ij}^{(n)} y_j$,  $y_i^{(1)} = y_i$.

Then

$y_i^{(n+2)} = \sum_{k=0}^{\infty} \sum_{j=0}^{\infty} p_{ik}^{(n)} p_{kj}\, y_j \le \lambda\, p_{i0}^{(n)} + \sum_{k=1}^{\infty} (y_k - 1)\, p_{ik}^{(n)} \le (1 + \lambda)\, p_{i0}^{(n)} + y_i^{(n+1)} - 1.$

It follows that if $y_i^{(n)}$ is finite, then $y_i^{(n+1)}$ is finite. But $y_i^{(1)}$ is finite for all i.
Therefore $y_i^{(n)}$ is finite for all i and n. We have now a recurrence relation from
which we obtain

$0 \le y_i^{(n+2)} \le (1 + \lambda) \sum_{r=1}^{n} p_{i0}^{(r)} + y_i^{(2)} - n.$

Therefore,

$n^{-1} y_i^{(n+2)} \le (1 + \lambda)\, n^{-1} \sum_{r=1}^{n} p_{i0}^{(r)} + n^{-1} y_i^{(2)} - 1.$

Letting $n \to \infty$, we have $0 \le (1 + \lambda)\pi_0 - 1$. Therefore, $\pi_0 \ge (1 + \lambda)^{-1} > 0$,
and so the system is ergodic.

The corresponding necessary condition can be given the following sharper 
form. 

THEOREM 3. If the system is ergodic, then the (finite) mean first-passage times,
$d_j$, from the jth to the zero state satisfy the equations

(6) $\sum_{j=1}^{\infty} p_{ij} d_j = d_i - 1$,  $i \ne 0$,

and

$\sum_{j=1}^{\infty} p_{0j} d_j < \infty.$


(For a proof see [1], p. 335.) 

In the following theorems we do not distinguish between recurrent-nonnull 
and recurrent-null systems. 

THEOREM 4. The system is transient if and only if there exists a bounded
nonconstant solution of the equations

(7) $\sum_{j=0}^{\infty} p_{ij} y_j = y_i$,  $i \ne 0$.

PROOF. The system will be recurrent if and only if there is probability unity
that the zero state is eventually attained from any state $i \ne 0$. Consider
therefore the modified system in which the zero state is made completely absorbing.
Denote the modified stochastic matrix by $[\tilde p_{ij}]$. [Thus $\tilde p_{00} = 1$, $\tilde p_{ij} = p_{ij}$, $i \ne 0$.]
Then we have $\lim_{n\to\infty} \tilde p_{ij}^{(n)} = \tilde\pi_{ij}$, where $\tilde\pi_{ij} = 0$, $j \ne 0$, and $\tilde\pi_{i0}$ is the probability
that the zero state is eventually attained from the ith state. Now $\sum_{j=0}^{\infty} \tilde p_{ij} \tilde\pi_{j0} = \tilde\pi_{i0}$
for all i. If the original system is transient, $\tilde\pi_{i0} < 1$ for some i, and in all cases
$\tilde\pi_{00} = 1$. Therefore, defining $y_i = \tilde\pi_{i0}$, we have a bounded nonconstant solution
for (7).

Conversely, suppose a bounded nonconstant solution of (7) to exist. Since
(for any constants α, β) $\{\alpha + \beta y_j\}$ is also a solution, we may suppose without loss
of generality that $y_0 = 1$, $0 \le y_j \le 2$ for all j. We have

(8) $\sum_{j=0}^{\infty} \tilde p_{ij} y_j = y_i$ for all i,

so that $\sum_{j=0}^{\infty} \tilde p_{ij}^{(n)} y_j = y_i$ for all i, n. Therefore, letting $n \to \infty$, we obtain
$\sum_j \tilde\pi_{ij} y_j \le y_i$ for all i, and hence $\tilde\pi_{i0} \le y_i$ for all i. But we must have either
$y_i < 1$ or $y_i > 1$ for some i. In the former case we have $\tilde\pi_{i0} < 1$. In the latter
case by considering the solution $\{2 - y_j\}$ we reach the same conclusion.

(A different proof of essentially this theorem has been given by Feller ({1], p. 
334). The above proof is included since the technique is required for the following 
two theorems.) 

THEOREM 5. The system is recurrent if there exists a solution $\{y_i\}$ of the
inequalities

(9) $\sum_{j=0}^{\infty} p_{ij} y_j \le y_i$,  $i \ne 0$,

such that $y_i \to \infty$ as $i \to \infty$.

PROOF. Using the same "matrix modification" technique as in Theorem 4,
we have (8) with inequalities, and we may assume without loss of generality
that $y_i \ge 0$ for all i. A proof that $\tilde\pi_{i0}$ is now identically equal to unity has been
given by Kendall [3].

The condition given in Theorem 5 would appear to be necessary for recurrence 
only under certain additional assumptions. (Cf. Foster [2].) 

The following variant of Theorem 4 is sometimes useful. 

THEOREM 6. The system is transient if and only if there exists a bounded solution
$\{y_i\}$ of the inequalities (9) such that $y_i < y_0$ for some i.

PROOF. The proof of the necessity is as in Theorem 4. To prove the sufficiency,
we have as before, assuming $y_0 = 1$, $\tilde\pi_{i0} \le y_i$, so that $\tilde\pi_{i0} < 1$ for some i.

It may be noted that since the numbering of the states is conventional, there
is no particular virtue in selecting the zero state for its special role in the above
theorems.

In the sequel we shall require the following familiar lemma from branching- 
process theory. (For a proof of it see e.g., [1], p. 226.) 
THEOREM 7. Given that $\{p_n\}$ (n = 0, 1, 2, ...) is a probability distribution with
$p_0 > 0$, the equation

$\sum_{n=0}^{\infty} z^n p_n = z$

possesses a root ξ in the range 0 < ξ < 1 if and only if $\sum_{n=0}^{\infty} n p_n > 1$.
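A small numerical illustration of the lemma, offered as a sketch rather than as part of the argument: for a distribution on finitely many points the root can be found by iterating z → Σ zⁿpₙ upward from z = 0. The function name and the example distribution are assumptions made here.

```python
def branching_root(p, tol=1e-14, max_iter=100000):
    """Root in (0, 1) of sum_n z**n * p[n] = z, or None when no such root exists."""
    mean = sum(n * pn for n, pn in enumerate(p))
    if p[0] <= 0.0 or mean <= 1.0:
        return None  # the condition of the lemma fails
    z = 0.0
    for _ in range(max_iter):
        nxt = sum(pn * z ** n for n, pn in enumerate(p))
        if abs(nxt - z) < tol:
            return nxt
        z = nxt
    return z

root = branching_root([0.2, 0.3, 0.5])   # mean 1.3 exceeds unity
none_case = branching_root([0.5, 0.5])   # mean 0.5 does not
```

For the first distribution the equation reduces to 0.5z² − 0.7z + 0.2 = 0, whose roots are 0.4 and 1, so the iteration should settle at 0.4.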


3. The queuing system M/G/1. (For an explanation of this labelling see
[5].) The associated stochastic matrix has the form

$k_0$  $k_1$  $k_2$  ...
$k_0$  $k_1$  $k_2$  ...
0      $k_0$  $k_1$  ...
0      0      $k_0$  ...
...

in which $k_i > 0$ for all i. Define $\rho = \sum_{n=0}^{\infty} n k_n$. We shall prove that the system
is ergodic if and only if ρ < 1; and that it is recurrent if and only if ρ ≤ 1.

Suppose first that ρ < 1. Define $y_j = j(1 - \rho)^{-1}$. We find that $\{y_j\}$ satisfies
the conditions of Theorem 2, and so the system is ergodic.
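The claim that $y_j = j(1-\rho)^{-1}$ satisfies the conditions of Theorem 2 can be checked numerically for a particular choice of the $k_n$. The sketch below assumes, purely for illustration, that the $k_n$ form a Poisson distribution with mean ρ (as they would for constant service-times) and truncates the infinite sums at a fixed number of terms.

```python
from math import exp, factorial

def poisson_pmf(mean, terms=80):
    """First `terms` probabilities of a Poisson distribution."""
    return [exp(-mean) * mean ** n / factorial(n) for n in range(terms)]

def drift_gap(i, k, rho):
    """sum_j p_ij y_j - (y_i - 1) for a row i >= 1, where p_ij = k[j - i + 1]."""
    def y(j):
        return j / (1.0 - rho)
    lhs = sum(kn * y(i - 1 + n) for n, kn in enumerate(k))
    return lhs - (y(i) - 1.0)

rho = 0.6
k = poisson_pmf(rho)
gaps = [drift_gap(i, k, rho) for i in range(1, 6)]
row_zero = sum(kn * n / (1.0 - rho) for n, kn in enumerate(k))  # sum_j p_0j y_j
```

Every gap should vanish up to rounding, so the inequalities (5) hold with equality, and the row-zero sum equals ρ/(1 − ρ), which is finite.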

Conversely, suppose the system to be ergodic. From the structure of the
matrix it will be clear that if $\mu_{ij}$ is the mean first-passage time from the ith to
the jth state then

$\mu_{i,i-1} = \mu_{10}$  $(i \ne 0)$.

Moreover, from the ith state the zero state can be attained only via the (i − 1)st
state. Therefore

$\mu_{i0} = \mu_{i,i-1} + \mu_{i-1,0}.$

It follows by an induction that

$\mu_{i0} = i\mu_{10}.$

Therefore, using Theorem 3, we have

$\sum_{j=1}^{\infty} p_{1j}\, j\, \mu_{10} = \mu_{10} - 1.$

Therefore

$\rho\mu_{10} = \mu_{10} - 1,$

so that

$\rho = 1 - 1/\mu_{10} < 1.$

We turn now to the problem of recurrence. If ρ ≤ 1, defining $y_j = j$, we find
that the inequalities (9) are satisfied, and $y_j \to \infty$ as $j \to \infty$. Therefore by
Theorem 5 the system is recurrent. (We have here merely reproduced the method
employed by Kendall [4].)

Conversely, suppose ρ > 1. By Theorem 7 the equation $\sum_{n=0}^{\infty} z^n k_n = z$ has a
root ξ in the range 0 < ξ < 1. Define $y_j = \xi^j$. We find that the equations (7)
are satisfied and $y_j \to 0$ as $j \to \infty$, with $y_0 = 1$. Therefore by Theorem 4, the
system is transient.


4. The queuing system GI/M/1. The associated stochastic matrix has the
form

$\bar a_0$  $a_0$  0      0      ...
$\bar a_1$  $a_1$  $a_0$  0      ...
$\bar a_2$  $a_2$  $a_1$  $a_0$  ...
...

where the elements $a_i$, $\bar a_i$ $[= \sum_{j=i+1}^{\infty} a_j]$ are all positive and $\sum a_i = 1$.

(The more general system GI/M/s studied in [5] would necessitate only
trivial alterations to the treatment given below.) Define ρ by $\rho^{-1} = \sum_{n=0}^{\infty} n a_n$.
We shall prove that the system is ergodic if and only if ρ < 1; and that it is
recurrent if and only if ρ ≤ 1.

Suppose first that ρ < 1. By Theorem 7, the equation

(10) $\sum_{n=0}^{\infty} z^n a_n = z$

has a root ξ in the range 0 < ξ < 1. Define $x_j = \xi^j$. We find that equations (1)
are satisfied and we have $\sum x_i < \infty$. Therefore the system is ergodic. On the
other hand, if ρ ≥ 1, $x_j = 1$ is a solution for (2). But in this case $\sum x_i$ is infinite,
and so by Theorem 1 the system cannot be ergodic. (The sufficiency only of the
condition ρ < 1 was previously proved by Kendall [5] using this method.)

We turn now to the problem of recurrence and consider the possible solutions
of the equations (7). We may without loss of generality suppose $y_0 = 0$. Define

$Y(z) = \sum_{n=1}^{\infty} y_n z^n,$
$A(z) = \sum_{n=0}^{\infty} a_n z^n,$

whenever these power-series converge. We find that

$Y(z) = a_0 y_1 z\,\{A(z) - z\}^{-1},$

$y_1$ being arbitrary. By Theorem 7, if ρ < 1 the right-hand side has a singularity
at some point z = ξ, 0 < ξ < 1. Therefore the sequence $\{y_n\}$ cannot be bounded
and nonconstant, and so, by Theorem 4, the system is recurrent.

We now consider ρ ≥ 1. Write

$A(z) - z = (1 - z)\Bigl\{1 - \dfrac{1 - A(z)}{1 - z}\Bigr\}.$

Following Kendall ([4], p. 159), we have

$\dfrac{1 - A(z)}{1 - z} = \sum_{n=0}^{\infty} z^n \sum_{i=n+1}^{\infty} a_i.$

Therefore for |z| < 1,

$\Bigl|\dfrac{1 - A(z)}{1 - z}\Bigr| < A'(1) \le 1,$

and so $|A(z) - z| \ne 0$. It follows that the power-series expansion,

$(1 - z)\{A(z) - z\}^{-1} = \sum_{n=0}^{\infty} b_n z^n,$

is valid for |z| < 1. It will further be observed that the coefficients $b_n$ are
nonnegative, and so, by Abel's theorem,

$\sum_{n=0}^{\infty} b_n = \lim_{z \to 1-} (1 - z)\{A(z) - z\}^{-1} = \{1 - \rho^{-1}\}^{-1},$

the right-hand side being infinite when ρ = 1. Also

$Y(z) = a_0 y_1 \dfrac{z}{1 - z} \sum_{n=0}^{\infty} b_n z^n = a_0 y_1 \sum_{n=0}^{\infty} z^{n+1} \sum_{i=0}^{n} b_i.$

Therefore $y_{n+1} = a_0 y_1 \sum_{i=0}^{n} b_i$. Therefore the sequence $\{y_n\}$ is bounded and
nonconstant if and only if ρ > 1. Therefore, by Theorem 4, the system is recurrent
if and only if ρ ≤ 1.
I am indebted to Mr. Kendall for showing me a copy of his paper [5] prior to 
publication, and for valuable criticism in preparing the present paper. 
BOUNDS ON A DISTRIBUTION FUNCTION WHEN ITS FIRST
n MOMENTS ARE GIVEN


By H. L. Royden
Stanford University


Introduction. Let F(X) be a nondecreasing function defined on the real line
with F(−∞) = 0 and

$\int_{-\infty}^{\infty} t^k\, dF(t) = M_k$,  k = 0, ..., 2n.

Then the problem of Tchebycheff is to find upper and lower bounds for F(X).
If X is a random variable with the cumulative distribution function F(X), this
is just the problem of determining the (sharp) upper and lower bounds for Pr.
(X < d). This problem has been solved by Markoff [2] and Stieltjes [5] and their
results are given in Section 1.

It is often of interest, however, to determine upper and lower bounds for
Pr. (|X| < d). This is the problem of determining upper and lower bounds
on the cumulative distribution function F*(X*) = F(X*) − F(−X*) of the
nonnegative random variable X* = |X|, and leads to the Stieltjes moment problem:
To determine upper and lower bounds on the nondecreasing function F* given

$\int_{0-}^{\infty} t^k\, dF^*(t) = M_k$,  k = 0, ..., n;

and F*(0−) = 0. The numbers $M_k$ are now the absolute moments of X, that is
$M_k = E(|X|^k)$. It should be noted that the set of moments

$M_{k\alpha} = E(|X|^{k\alpha})$,  k = 0, ..., n,

serves just as well as $M_0, \cdots, M_n$, since they are the first n algebraic moments
of the nonnegative random variable $Y = |X|^{\alpha}$ with the cumulative distribution
function $G(Y) = F(Y^{1/\alpha})$.

In the second and third section we give a solution to this problem which cor- 
responds to the classical Tchebycheff inequalities for the Hamburger moment 
problem, and apply these general results in the next section to obtain the Cantelli 
inequalities. I would like to point out that Theorems 1 and 2 can be derived from
very general results of Krein [9]. However, the self-contained approach used 
here seems to me desirable in view of the complexity and inaccessibility of Krein’s 
results. In the last section we solve the problem of determining sharp upper and 
lower bounds for a distribution given the first two (absolute) moments about the 
mode. 


1. The Tchebycheff inequalities. A point t is said to belong to the spectrum
of the random variable X or of the corresponding distribution function F(X)
if there is no interval about t in which F(X) is constant. A random variable
(distribution function) is called arithmetic if it is continuous on the right and its
spectrum is a finite set of points. We say that a distribution function belongs to
a set of moments if it has these moments. By the mass at t we mean the quantity
F(t+) − F(t−).

Received 10/20/52.

For the sake of completeness we include the method for determining the
(sharp) upper and lower bounds for the distribution function F(d), that is, Pr.
(x ≤ d), when we are given the first 2k algebraic moments of F(x) about the
origin. A distribution function belonging to these moments is said to be
characteristic for d if it is arithmetic and its spectrum contains fewer than k points
other than d. Then we have the following propositions.

Proposition 1. Given the moments $M_0, \cdots, M_{2k}$ of some distribution
function F(x), and a real number d, there is a distribution function $F_d(x)$ belonging
to them which is characteristic for d.

Proposition 2. We have $F_d(d-) \le F(d) \le F_d(d)$.

Proposition 3. The points $x_i$ at which $F_d$ increases are the zeros of the
polynomial


M, - My-1 
M, - M 
Q(z) = Qar(x) = 


oe Maas 


The mass $m_i$ of $F_d$ at $x_i$ is given by

(1) $m_i = \sum_{l=0}^{k} c_l^{(i)} M_l$

where

$Q_i(x) = \sum_{l=0}^{k} c_l^{(i)} x^l = \prod_{j \ne i} \dfrac{x - x_j}{x_i - x_j}.$

(If Q has no multiple zeros we have $Q_i(x) = Q(x)/(Q'(x_i)(x - x_i))$.)
For a proof the reader may either refer to ([4], p. 43) or construct one analogous
to that given here in Sections 2 and 3.
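The mass formula can be read as Lagrange interpolation: integrating the polynomial that equals one at $x_i$ and zero at the other spectral points picks out the mass at $x_i$. The sketch below checks this on a made-up three-point distribution; the nodes, masses and function names are assumptions chosen for illustration.

```python
def poly_mul(p, q):
    """Multiply two polynomials given by ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[a + b] += pa * qb
    return out

def lagrange_coeffs(nodes, i):
    """Ascending coefficients of prod_{j != i} (x - x_j) / (x_i - x_j)."""
    coeffs = [1.0]
    for j, xj in enumerate(nodes):
        if j != i:
            scale = nodes[i] - xj
            coeffs = poly_mul(coeffs, [-xj / scale, 1.0 / scale])
    return coeffs

def masses_from_moments(nodes, moments):
    """m_i = sum_l c_l^(i) M_l for an arithmetic distribution on `nodes`."""
    return [sum(c * moments[l] for l, c in enumerate(lagrange_coeffs(nodes, i)))
            for i in range(len(nodes))]

nodes = [0.0, 1.0, 2.0]
true_masses = [0.2, 0.5, 0.3]
moments = [sum(m * x ** l for m, x in zip(true_masses, nodes)) for l in range(3)]
recovered = masses_from_moments(nodes, moments)
```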


2. The Stieltjes case. In this section we consider the sharpening of the
foregoing inequalities which is possible in the case of a positive random variable.
Let $M_0, M_1, \cdots, M_n$ be the first n moments of a positive random variable
X with the cumulative distribution function F(X), that is, $M_k = \int_0^\infty t^k\, dF(t)$.
Then it is known [4] that the determinants $\Delta_{2k} = |M_{i+j}|_{i,j=0}^{k}$ and
$\Delta_{2k+1} = |M_{i+j+1}|_{i,j=0}^{k}$ satisfy

$\Delta_\nu > 0,$

unless the moments $\{M_\nu\}$ belong to an arithmetic distribution function having
mass at n/2 or fewer points, counting the point zero with multiplicity one-half.
In this latter case, that is, when $\Delta_n = 0$, the distribution function having these
moments is unique, and the upper and lower bounds are trivial. Consequently,
we shall assume that the $\Delta_\nu$ are all positive.

The set of distribution functions which have moments of order n may be
considered as a set of positive linear functionals on the space $\mathfrak{N}$ consisting of
those continuous functions φ on [0, ∞] for which $\lim_{t\to\infty} \varphi(t)/t^n$ exists.

We say that a sequence of distribution functions $F_j$ converges to the distribution
function F if $\lim \int_{0-}^{\infty} \varphi\, dF_j = \int_{0-}^{\infty} \varphi\, dF$ for all $\varphi \in \mathfrak{N}$. It follows at once
from the definition of the Stieltjes integral that every distribution function is
the limit of arithmetic distribution functions.

Let $\mathfrak{F} = \mathfrak{F}(n, M)$ be the set of arithmetic distribution functions which have
mass on at most n + 1 points and for which $\int t^n\, dF(t) \le M$. We shall find
it convenient to include in $\mathfrak{F}$ the positive linear functionals on $\mathfrak{N}$ defined by

$L_\lambda[\varphi] = \lambda \lim_{t\to\infty} \dfrac{\varphi(t)}{t^n}$,  λ ≥ 0.

We say that $L_\lambda$ arises by placing an "infinitesimal" mass λ at infinity. Thus if
F is an arithmetic distribution function having masses $m_i$ at $x_i$ and an infinitesimal
mass λ at infinity we write

$\int \varphi\, dF = \sum m_i \varphi(x_i) + \lambda \lim_{t\to\infty} \dfrac{\varphi(t)}{t^n}.$


. 
n 


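LEMMA 1. The set $\mathfrak{F}$ is sequentially compact.

PROOF. Given a sequence $F^{(j)} \in \mathfrak{F}$, we may choose a subsequence such that
(a) each spectral point $x_i^{(j)}$ of $F^{(j)}$ converges to some point $x_i \in [0, \infty]$. (b) If
$x_i \ne \infty$, the masses $m_i^{(j)}$ which $F^{(j)}$ has at $x_i^{(j)}$ converge to some number $m_i$.
(c) The integrals $I_j = \int_0^\infty t^n\, dF^{(j)}$ converge to some number I.

Any function $\varphi \in \mathfrak{N}$ may be written as $\varphi = \varphi_0 + a t^n$ where $\lim_{t\to\infty} \varphi_0(t)/t^n = 0$
and $\lim_{t\to\infty} \varphi(t)/t^n = a$. Given ε > 0, we may choose δ so large that

$\Bigl|\dfrac{\varphi_0(t)}{t^n}\Bigr| < \varepsilon/M$ for all t ≥ δ.

Consequently,

$\Bigl|\int_\delta^\infty \varphi_0(t)\, dF^{(j)}\Bigr| < (\varepsilon/M) \int_\delta^\infty t^n\, dF^{(j)} \le \varepsilon.$

Since $\varphi_0$ is uniformly continuous on [0, δ], we may choose j so large that

$\Bigl|\sum_{x_i < \delta} \bigl[m_i \varphi_0(x_i) - m_i^{(j)} \varphi_0(x_i^{(j)})\bigr]\Bigr| < \varepsilon.$

Hence

$\Bigl|\int_{0-}^{\infty} \varphi_0(t)\, dF^{(j)}(t) - \sum_{x_i < \delta} m_i \varphi_0(x_i)\Bigr| < 2\varepsilon,$

whence, taking j so large that $|I - I_j| < \varepsilon$,

$\Bigl|\int_{0-}^{\infty} \varphi(t)\, dF^{(j)}(t) - \sum_{x_i < \delta} m_i \varphi_0(x_i) - aI\Bigr| < 3\varepsilon.$

Since ε was arbitrary and δ → ∞ as ε → 0,

$\lim \int_{0-}^{\infty} \varphi(t)\, dF^{(j)} = \sum m_i \varphi_0(x_i) + aI.$

Now

$I = \lim \int_{0-}^{\infty} t^n\, dF^{(j)}(t) \ge \lim \int_{0-}^{\delta} t^n\, dF^{(j)}(t) = \sum_{x_i < \delta} m_i (x_i)^n.$

Since δ is arbitrary $I \ge \sum m_i (x_i)^n$, whence $\lambda = I - \sum m_i (x_i)^n \ge 0$. Since
$\varphi(x_i) = \varphi_0(x_i) + a x_i^n$, we have

$\lim \int_{0-}^{\infty} \varphi(t)\, dF^{(j)} = \sum m_i \varphi(x_i) + a\lambda = \int_{0-}^{\infty} \varphi(t)\, dF(t),$

where F is that distribution function in $\mathfrak{F}$ which has mass $m_i$ at $x_i$ and an
infinitesimal mass λ at infinity. Thus $F^{(j)} \to F$, proving the lemma.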

LEMMA 2. If $\{M_0, \cdots, M_n\}$ are the moments of some distribution function,
they are the moments of a distribution in $\mathfrak{F} = \mathfrak{F}(n, M)$, with $M_n \le M$.

PROOF. Every distribution function with n moments is the limit of arithmetic
distributions by the definition of the Stieltjes integral. Since $\mathfrak{F}$ is sequentially
compact and the moments are continuous, it suffices to show that if $M_0, \cdots,
M_n$ are the moments of an arithmetic distribution, they are the moments of a
distribution in $\mathfrak{F}$. Let A be the region in n-dimensional Euclidean space whose
points are moments of arithmetic distributions. Then A is the convex hull of
the curve

C: $M_i = t^i$,  i = 1, ..., n;  0 ≤ t.

Thus every point of A must be in the simplex determined by some n + 1 points
of C, that is,

$M_i = \sum_j m_j (t_j)^i$;  $m_j \ge 0$;  $\sum_j m_j = 1.$

But this is just the statement that the $M_i$ are the moments of the arithmetic
distribution F which has mass $m_j$ at $t_j$, and this belongs to $\mathfrak{F}$ as soon as $M_n \le M$.

LEMMA 3. The mass m = m(d), which a distribution F has at a given fixed point
d is upper semi-continuous as a function of F.
PROOF. Let $F^{(j)}$ with mass $m^{(j)}$ at d converge to F with mass m at d. Let φ
be a positive continuous function belonging to $\mathfrak{N}$, which is one at d and vanishes
at the other spectral points of F. Then

$m = \int_{0-}^{\infty} \varphi\, dF = \lim \int_{0-}^{\infty} \varphi\, dF^{(j)} \ge \limsup m^{(j)},$

as was to be proved.

DEFINITION. Let $M_0, M_1, \cdots, M_n$ be the Stieltjes moments of some
distribution function. An arithmetic distribution function belonging to this set is
said to be characteristic for d if its spectrum contains at most n/2 points different
from d where zero and infinity are counted with multiplicity one-half.

THEOREM 1. If $M_0, \cdots, M_n$ are the Stieltjes moments of some distribution
function and d is a positive number, then there is an arithmetic distribution function
belonging to $M_0, \cdots, M_n$ which is characteristic for d.

PROOF. By Lemmas 1, 2, and 3, there is an arithmetic distribution function
$F_d$ belonging to $M_0, \cdots, M_n$ which has the largest possible mass $m_d$ at the point
d. Suppose the spectrum of $F_d$ contained more than n/2 points. Then if n is
odd we must have one of the cases A or B below.

Case A. Positive masses $m_i$ at the k points $x_i$ where k = (n + 1)/2. Then
the Jacobian of the moments $M_0, \cdots, M_n$ with respect to changes in $m_i$ and
$x_i$ is

$\begin{vmatrix} 1 & \cdots & 1 & 0 & \cdots & 0 \\ x_1 & \cdots & x_k & m_1 & \cdots & m_k \\ \vdots & & \vdots & \vdots & & \vdots \\ x_1^n & \cdots & x_k^n & n m_1 x_1^{n-1} & \cdots & n m_k x_k^{n-1} \end{vmatrix} = \pm\, m_1 \cdots m_k \prod_{i<j} (x_i - x_j)^4 \ne 0,$

and we can increase the mass at d by a small amount and change the $m_i$ and
$x_i$ so that the moments $M_0, \cdots, M_n$ remain the same. Thus case A is impossible.

Case B. Positive masses $m_i$ at the k points $x_i$, mass $m_0$ at zero and infinitesimal
mass λ at infinity, where k = (n − 1)/2. Here the Jacobian is

$\begin{vmatrix} 1 & 1 & \cdots & 1 & 0 & \cdots & 0 & 0 \\ 0 & x_1 & \cdots & x_k & m_1 & \cdots & m_k & 0 \\ \vdots & \vdots & & \vdots & \vdots & & \vdots & \vdots \\ 0 & x_1^n & \cdots & x_k^n & n m_1 x_1^{n-1} & \cdots & n m_k x_k^{n-1} & 1 \end{vmatrix} = \pm\, m_1 \cdots m_k\, x_1^2 \cdots x_k^2 \prod_{i<j} (x_i - x_j)^4 \ne 0,$

and Case B is impossible.

Using similar arguments in the case of even n, we see that the spectrum of $F_d$
can contain at most n/2 points other than d, thus proving the theorem.

THEOREM 2. Let $M_0, \dots, M_n$ be the Stieltjes moments of a distribution function F(x). Let $F_d(x)$ belong to $M_0, \dots, M_n$ and be characteristic for d. Then $F_d(d-) \le F(d) \le F_d(d)$.

PROOF. We consider four cases:

Case A. Even n and the spectrum of $F_d$ consists of d and $k \le n/2$ other points $x_1, \dots, x_k$ all different from zero and infinity. We construct a polynomial P(x) of degree 2k which satisfies the following conditions:

$$\begin{aligned}
P'(x_i) &= 0, & i &= 1, \dots, k, \\
P(x_i) &= 1, & x_i &< d, \\
P(d) &= 1, \\
P(x_i) &= 0, & x_i &> d.
\end{aligned}$$

Such a polynomial exists, since these conditions are 2k + 1 linear equations for the 2k + 1 coefficients of the polynomial and the determinant of this system is

$$\prod_{i} (d - x_i)^2 \prod_{i<j} (x_i - x_j)^4 \ne 0.$$

It is easily verified that $P(x) \ge 1$ for $x \le d$ and $P(x) \ge 0$ for all x. Since $F_d$ has the same first 2k + 1 moments as F we have

$$F_d(d) = \int_{0-}^{\infty} P(x)\, dF_d(x) = \int_{0-}^{\infty} P(x)\, dF(x) \ge \int_{0-}^{d} P(x)\, dF(x) \ge \int_{0-}^{d} dF(x) = F(d).$$

Similarly $F_d(d-) \le F(d)$.

Case B. Even n and the spectrum of $F_d$ consists of d and k = (n/2) − 1 points other than zero and infinity. We construct a polynomial of degree 2k + 1 which satisfies the conditions

$$\begin{aligned}
P'(x_i) &= 0, \\
P(0) &= 1, \\
P(x_i) &= 1, & x_i &< d, \\
P(d) &= 1, \\
P(x_i) &= 0, & x_i &> d.
\end{aligned}$$

Such a polynomial exists as before, and also $P(x) \ge 0$ for $x \ge d$, and $P(x) \ge 1$ for $0 \le x \le d$. Hence, since P is of degree less than n, we have

$$F_d(d) = \int_{0-}^{\infty} P(x)\, dF_d(x) = \int_{0-}^{\infty} P(x)\, dF(x) \ge \int_{0-}^{d} P(x)\, dF(x) \ge \int_{0-}^{d} dF(x) = F(d).$$

Similarly $F_d(d-) \le F(d)$.

Case C. Odd n and the spectrum of $F_d$ consists (possibly) of 0, d and $k \le (n - 1)/2$ other points all different from infinity. Construct P(x) of degree 2k + 1 such that

$$\begin{aligned}
P'(x_i) &= 0, \\
P(0) &= 1, \\
P(x_i) &= 1, & x_i &< d, \\
P(d) &= 1, \\
P(x_i) &= 0, & x_i &> d.
\end{aligned}$$

Such a polynomial exists as before and $P(x) \ge 0$ for $x \ge d$ and $P(x) \ge 1$ for $0 \le x \le d$. Hence

$$F_d(d) = \int_{0-}^{\infty} P(x)\, dF_d(x) = \int_{0-}^{\infty} P(x)\, dF(x) \ge F(d),$$

and similarly $F_d(d-) \le F(d)$.

Case D. Odd n and the spectrum of $F_d$ consists (possibly) of d, infinity and $k \le (n - 1)/2$ other points all different from zero. Construct P(x) of degree 2k such that

$$\begin{aligned}
P'(x_i) &= 0, \\
P(x_i) &= 1, & x_i &< d, \\
P(d) &= 1, \\
P(x_i) &= 0, & x_i &> d.
\end{aligned}$$

Here $P(x) \ge 1$ for $x \le d$ and $P(x) \ge 0$. Hence $F_d(d) \ge F(d)$ and $F_d(d-) \le F(d)$ as before.


This plethora of cases establishes the theorem. 

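3. Determination of $F_d$.

LEMMA 4. Let n be greater than or equal to 2k, and let the roots of

$$D(\lambda) = \begin{vmatrix}
1 & M_0 & \cdots & M_{k-1} \\
\lambda & M_1 & \cdots & M_k \\
\vdots & \vdots & & \vdots \\
\lambda^k & M_k & \cdots & M_{2k-1}
\end{vmatrix}$$

be $\lambda_1 \le \cdots \le \lambda_k$. Let the roots of

$$\Delta(\theta) = \begin{vmatrix}
\theta & M_1 & \cdots & M_k \\
\theta^2 & M_2 & \cdots & M_{k+1} \\
\vdots & \vdots & & \vdots \\
\theta^{k+1} & M_{k+1} & \cdots & M_{2k}
\end{vmatrix}$$

be $\theta_1 \le \theta_2 \le \cdots \le \theta_{k+1}$. Then $0 = \theta_1 < \lambda_1 < \theta_2 < \cdots < \lambda_k < \theta_{k+1}$, and the polynomial

$$Q_d(x) = \begin{vmatrix}
1 & 1 & M_0 & \cdots & M_{k-1} \\
x & d & M_1 & \cdots & M_k \\
\vdots & \vdots & \vdots & & \vdots \\
x^{k+1} & d^{k+1} & M_{k+1} & \cdots & M_{2k}
\end{vmatrix}$$

has all positive roots if either $\theta_i < d < \lambda_i$ for some i or $\theta_{k+1} < d$, and has one negative root if

$$\lambda_i < d < \theta_{i+1}$$

for some i.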

PROOF. We first note that $D(\lambda)$ is (apart from a constant factor) the orthogonal polynomial of degree k with respect to dF. Hence ([6], p. 43) the $\lambda_i$ are all positive and distinct. By Propositions 1 and 3 there is an arithmetic distribution $G_d$ defined on $(-\infty, \infty)$ whose spectrum consists of the zeros of $Q_d(x)$, and which has $M_0, \dots, M_{2k}$ as its (algebraic) moments. By Proposition 2 we see that the zeros of $Q_d(x)$ and $Q_\delta(x)$ separate each other. Consequently, as d increases, the zeros of $Q_d(x)$ must increase until the largest becomes $+\infty$. Proposition 2 also shows that $Q_d$ can have at most one negative zero. Hence for d < 0, d is the only negative root of $Q_d$. As d increases all roots of $Q_d$ increase and so for d slightly larger than zero all roots are positive and remain so until the largest becomes infinite, but this happens only when the coefficient of $x^{k+1}$ in $Q_d$ vanishes, that is, when D(d) = 0. Thus all roots are positive if $0 < d < \lambda_1$. For d slightly larger than $\lambda_1$ we have a large negative root which increases to zero as d increases, and $Q_d$ has a negative root for $\lambda_1 < d < \theta_2$, where $\theta_2$ is the smallest root of $\Delta(\theta)$ which is larger than $\lambda_1$. Continuing this process we see that the roots of $\Delta(\theta)$ separate those of $D(\lambda)$ and that the lemma is true.

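THEOREM 3. Let n = 2k be even, and let $\lambda_i$ and $\theta_i$ be defined as in Lemma 4. Then if $\theta_i < d < \lambda_i$ or $\theta_{k+1} < d$, the spectrum of $F_d$ consists of the roots $x_i$ of

$$Q_d(x) = Q_{d,k}(x) = \begin{vmatrix}
1 & 1 & M_0 & \cdots & M_{k-1} \\
x & d & M_1 & \cdots & M_k \\
\vdots & \vdots & \vdots & & \vdots \\
x^{k+1} & d^{k+1} & M_{k+1} & \cdots & M_{2k}
\end{vmatrix} = 0.$$

The mass $m_i$ concentrated at $x_i$ is given by

$$m_i = \sum_{l=0}^{k} c_l^{(i)} M_l, \tag{2}$$

where

$$Q^{(i)}(x) = \sum_{l=0}^{k} c_l^{(i)} x^l = \frac{Q_d(x)}{Q_d'(x_i)(x - x_i)}.$$

If $\lambda_i < d < \theta_{i+1}$, the spectrum of $F_d$ consists of $\infty$ and the roots $x_i$ of

$$R(x) = R_{d,k}(x) = \begin{vmatrix}
x & d & M_1 & \cdots & M_{k-1} \\
\vdots & \vdots & \vdots & & \vdots \\
x^{k+1} & d^{k+1} & M_{k+1} & \cdots & M_{2k-1}
\end{vmatrix}.$$

The mass $m_i$ at $x_i$ is given by

$$m_i = \sum_{l=0}^{k} b_l^{(i)} M_l, \tag{3}$$

where

$$R^{(i)}(x) = \sum_{l=0}^{k} b_l^{(i)} x^l = \frac{R_d(x)}{R_d'(x_i)(x - x_i)}.$$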

PROOF. If

$$\theta_i < d < \lambda_i \quad \text{or} \quad \theta_{k+1} < d$$

then the characteristic distribution $F_d$ of Proposition 1 is by Proposition 3 a characteristic distribution for the Stieltjes problem, and the masses are as stated. Suppose on the other hand that $F_d$ as guaranteed by Theorem 2 had mass at d and the k other points $x_1, \dots, x_k$ all different from zero or infinity. Then $Q_d(x)$ is orthogonal to all polynomials of degree k − 1 or less. Thus

$$0 = \int_{0-}^{\infty} Q_d(x)(x - x_1) \cdots (x - x_{k-1})\, dF_d(x) = m_k\, Q_d(x_k) \prod_{j<k} (x_k - x_j),$$

since $Q_d(d) = 0$. Hence $x_k$ is a root of $Q_d(x)$ and so must the other $x_i$ be, and we must have all roots of $Q_d(x)$ nonnegative in this case. Thus in the case of a negative root of $Q_d$ we must have the spectrum of $F_d$ consisting of at most 0, d, infinity and the k − 1 points $x_1, \dots, x_{k-1}$. Now $R_d(x)$ is orthogonal to all polynomials of degree k − 2 or less and vanishes at zero and d. Consequently

$$0 = \int_{0-}^{\infty} R_d(x)(x - x_1) \cdots (x - x_{k-2})\, dF_d(x) = m_{k-1}\, R_d(x_{k-1}) \prod_{j<k-1} (x_{k-1} - x_j),$$

and $x_{k-1}$ is a root of $R_d$. Similarly with the remaining $x_i$'s.

Thus we need only verify the expression for the masses $m_i$, which follows immediately from the fact that $R^{(i)}(x)$ is of degree less than n and vanishes at all spectral points of $F_d$ except $x_i$ where it has the value one.

A similar argument gives 

THEOREM 3'. Let n be odd and k = (n − 1)/2. Then if

$$\theta_i < d < \lambda_i \quad \text{or} \quad \theta_{k+1} < d$$

with $\lambda_i$ and $\theta_i$ as in Lemma 4, the spectrum of $F_d(x)$ consists of infinity and the roots $x_i$ of $Q_{d,k}(x) = 0$, and the masses $m_i$ at $x_i$ are given by (2). If $\lambda_i < d < \theta_{i+1}$ the spectrum of $F_d$ consists of the roots $x_i$ of $R_{d,k+1}(x) = 0$ and the masses $m_i$ at $x_i$ are given by (3) with $R_{d,k}(x)$ replaced by $R_{d,k+1}(x)$.


4. Some special cases. If X is a random variable with algebraic moments $M_1$ and $M_2$ given, we can use Propositions 1, 2, and 3 to calculate the inequalities

$$0 \le \Pr(X \le d) \le \frac{M_2 - M_1^2}{M_2 - M_1^2 + (M_1 - d)^2}, \qquad d < M_1,$$

$$\frac{(M_1 - d)^2}{M_2 - M_1^2 + (M_1 - d)^2} \le \Pr(X \le d) \le 1, \qquad d > M_1.$$

This is the well known inequality of Tchebycheff.
If we know in addition that X is a positive random variable (say the absolute value of another variable) then Theorems 1, 2, and 3 enable one to calculate

$$0 \le \Pr(X \le d) \le \frac{M_2 - M_1^2}{M_2 - M_1^2 + (M_1 - d)^2}, \qquad 0 < d < M_1,$$

$$1 - \frac{M_1}{d} \le \Pr(X \le d) \le 1, \qquad M_1 \le d \le \frac{M_2}{M_1},$$

$$\frac{(M_1 - d)^2}{M_2 - M_1^2 + (M_1 - d)^2} \le \Pr(X \le d) \le 1, \qquad \frac{M_2}{M_1} < d,$$

which are the Cantelli inequalities [7]. We see that they are an improvement over the Tchebycheff inequalities in the region $M_1 < d \le M_2/M_1$.
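
As a numerical illustration (not part of the original derivation), the following Python sketch assumes that the first bound of this section reads Pr(X ≤ d) ≤ (M2 − M1²)/(M2 − M1² + (M1 − d)²) for d < M1. It checks in exact rational arithmetic that a two-point distribution with the prescribed moments attains this bound. The function names and the sample values M1 = 2, M2 = 5, d = 1 are illustrative assumptions.

```python
from fractions import Fraction


def cantelli_upper(m1, m2, d):
    """Assumed upper bound for Pr(X <= d) when d < m1 (mean m1, second moment m2)."""
    var = m2 - m1 * m1
    return var / (var + (m1 - d) ** 2)


def extremal_two_point(m1, m2, d):
    """Two-point distribution with mean m1 and second moment m2 that puts
    as much mass as the bound allows at the point d (requires d < m1)."""
    var = m2 - m1 * m1
    p = var / (var + (m1 - d) ** 2)   # mass placed at d
    other = m1 + var / (m1 - d)       # location of the remaining mass
    return [(d, p), (other, 1 - p)]


m1, m2, d = Fraction(2), Fraction(5), Fraction(1)
dist = extremal_two_point(m1, m2, d)

mean = sum(x * p for x, p in dist)
second = sum(x * x * p for x, p in dist)
mass_up_to_d = sum(p for x, p in dist if x <= d)
bound = cantelli_upper(m1, m2, d)
```

With these sample values the two-point distribution reproduces both moments and its mass up to d coincides with the bound, which is one way to see that the inequality cannot be improved without further information.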

5. Unimodal distributions. In this section we determine the (sharp) upper and lower bounds for $\Pr(X \le d)$ of a unimodal distribution when its first two moments about the mode are given. It should be emphasized that this is a distinct problem from that in which the moments of orders a and 2a are given, since one of the transformations

$$X^* = X^a; \qquad X = (X^*)^{1/a}$$

does not take unimodal distributions into unimodal distributions if $a \ne 1$, although the methods used below should work equally well in the case of any two moments.

Mathematically, we consider distribution functions of the form

$$F(X) = F(0) + \int_0^X \varphi(t)\, dt, \tag{4}$$

where $\varphi(t)$ is a nonincreasing function of t, and we are given

$$\int_{0-}^{\infty} dF(X) = 1, \qquad \int_{0-}^{\infty} X\, dF(X) = M_1, \qquad \int_{0-}^{\infty} X^2\, dF(X) = M_2. \tag{5}$$

Every distribution function F of the form (4) is the limit of distributions of the form

$$F = \sum m_i F_i, \qquad m_i \ge 0, \qquad \sum m_i = 1, \tag{6}$$

where the $F_i$ are rectangular distributions, that is,

$$F_i(X) = \begin{cases} X/t_i, & X \le t_i, \\ 1, & X > t_i, \end{cases}
\qquad \text{or} \qquad
F_0(X) = \begin{cases} 0, & X = 0, \\ 1, & X > 0. \end{cases}$$

For brevity we call these the rectangular distribution at $t_i$ or at zero, and say that F has rectangular distributions of strength $m_i$ at $t_i$.

We consider the space $\mathfrak{S}$ of all convex combinations of five rectangular distributions whose second moment does not exceed some fixed constant M and compactify it (as in Section 2) by the addition of an infinitesimal distribution at infinity. Then we have the following lemma by the method of Lemma 2.

LEMMA 5. Let F(X) be a unimodal distribution on $(0, \infty]$ and d a given positive number. Then there is a distribution $F^*$ in $\mathfrak{S}$ for which

$$F(d) = F^*(d), \qquad \int_{0-}^{\infty} X\, dF(X) = \int_{0-}^{\infty} X\, dF^*(X), \qquad \int_{0-}^{\infty} X^2\, dF(X) = \int_{0-}^{\infty} X^2\, dF^*(X).$$

Thus we need only consider convex combinations of a finite number of rectangular distributions. Since the upper limit for $\Pr(X \le d)$ is a continuous function of d, we consider only cases in which there is a rectangular distribution at a finite point larger than d and take 1 for the sharp upper limit of $\Pr(X \le d)$ after this upper limit once becomes one.

If X is a positive random variable with a unimodal distribution function and moments $M_1$ and $M_2$, we normalize by taking $x = X/2M_1$. Then x has mean 1/2 and second moment $M_2/4M_1^2 = A/3$. It is known [1] that a necessary and sufficient condition for $M_1$ and $M_2$ to be the first two moments of a unimodal distribution is that $A \ge 1$. If now $F = \sum m_i F_i + \lambda F_\infty$, then

$$1 = \sum m_i, \qquad 1 = \sum m_i t_i, \qquad A = \sum m_i t_i^2 + \lambda.$$


Suppose that F had rectangular distributions on at least two finite points $t_1$ and $t_2$ which are greater than d. Then

$$P = \Pr(x \le d) = \left( \frac{m_1}{t_1} + \frac{m_2}{t_2} \right) d + c, \tag{7}$$

where c is independent of $m_1$, $m_2$, $t_1$ and $t_2$. In order for P to be a maximum or a minimum with the total mass, first and second moments fixed, the Jacobian of these quantities with respect to $m_1$, $m_2$, $t_1$ and $t_2$ must vanish. But this Jacobian is

$$\begin{vmatrix}
\dfrac{d}{t_1} & \dfrac{d}{t_2} & -\dfrac{d m_1}{t_1^2} & -\dfrac{d m_2}{t_2^2} \\
1 & 1 & 0 & 0 \\
t_1 & t_2 & m_1 & m_2 \\
t_1^2 & t_2^2 & 2 m_1 t_1 & 2 m_2 t_2
\end{vmatrix}
= -\frac{m_1 m_2 d}{t_1^2 t_2^2} (t_1 - t_2)^4 \ne 0.$$

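Hence an extremal distribution function must have a rectangular distribution on exactly one finite point t > d. Suppose now that an extremal distribution had rectangular distributions at two points $t_1$, $t_2$ which are less than or equal to d, and one of which (say $t_1$) is different from zero and d. Then the Jacobian with respect to $t_1$, $m_1$, $m_2$ and $m_3$ would have to vanish. But this Jacobian is

$$\begin{vmatrix}
0 & 1 & 1 & \dfrac{d}{t} \\
0 & 1 & 1 & 1 \\
m_1 & t_1 & t_2 & t \\
2 m_1 t_1 & t_1^2 & t_2^2 & t^2
\end{vmatrix}
= m_1 \left( \frac{d}{t} - 1 \right) (t_1 - t_2)^2 \ne 0.$$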


In a similar manner it is readily verified that an extremal distribution which has a rectangular distribution at a point $t_1$, $0 < t_1 < d$, can have no rectangular distribution at infinity, and that no extremal distribution can have rectangular distributions at zero, d and infinity. Thus an extremal distribution must belong to one of the following cases.

CASE 1. Rectangular distributions at exactly two points $t_1$ and t with $0 < t_1 < d < t$. Then

$$P = \Pr(x \le d) = m_1 + \frac{m_2 d}{t} \tag{8}$$

and

$$1 = m_1 + m_2, \qquad 1 = m_1 t_1 + m_2 t, \qquad A = m_1 t_1^2 + m_2 t^2. \tag{9}$$

In order for P to be an extremum

$$\begin{vmatrix}
1 & \dfrac{d}{t} & 0 & -\dfrac{m_2 d}{t^2} \\
1 & 1 & 0 & 0 \\
t_1 & t & m_1 & m_2 \\
t_1^2 & t^2 & 2 m_1 t_1 & 2 m_2 t
\end{vmatrix}
= \frac{m_1 m_2 (t - t_1)}{t^2} (2t^2 - 3td + t_1 d) = 0,$$

and hence

$$d = \frac{2t^2}{3t - t_1}. \tag{10}$$

In order for there to be numbers $m_1$ and $m_2$ for which equations (9) are satisfied, we must have

$$\begin{vmatrix}
1 & 1 & 1 \\
t_1 & t & 1 \\
t_1^2 & t^2 & A
\end{vmatrix}
= (t - t_1)\left[ A - t - t_1 (1 - t) \right] = 0,$$

whence $t_1 = (t - A)/(t - 1)$. In order for $t_1$ to be positive and less than t, t must be greater than A. Solving (9),

$$m_1 = \frac{(t - 1)^2}{A - 1 + (t - 1)^2}, \qquad m_2 = \frac{A - 1}{A - 1 + (t - 1)^2},$$

and thus

$$P = \frac{(3t - 1)(t - 1)}{3t^2 - 4t + A} \tag{11}$$

and

$$d = \frac{2t^2 (t - 1)}{3t^2 - 4t + A}.$$

Since $t \ge A$, Case 1 can only arise when $d \ge 2A/3$.
CASE 2. Rectangular distributions at zero, d, t and possibly infinity. In order to have an extremal distribution we would need

$$\begin{vmatrix}
1 & 1 & \dfrac{d}{t} & -\dfrac{m_2 d}{t^2} \\
1 & 1 & 1 & 0 \\
0 & d & t & m_2 \\
0 & d^2 & t^2 & 2 m_2 t
\end{vmatrix}
= -\frac{2 m_2 d}{t} (d - t)^2 = 0.$$

But this expression is different from zero. Thus Case 2 never occurs.
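CASE 3. Rectangular distributions at d, t and infinity. Then for an extremal distribution

$$\begin{vmatrix}
1 & \dfrac{d}{t} & -\dfrac{m_2 d}{t^2} & 0 \\
1 & 1 & 0 & 0 \\
d & t & m_2 & 0 \\
d^2 & t^2 & 2 m_2 t & 1
\end{vmatrix}
= \frac{m_2 (t - d)^2}{t^2} = 0.$$

But this expression is different from zero. Thus Case 3 can never arise.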
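CASE 4. Rectangular distributions at d and t only. Then

$$m_1 + m_2 = 1, \qquad m_1 d + m_2 t = 1, \qquad m_1 d^2 + m_2 t^2 = A \tag{12}$$

and

$$A - d - t(1 - d) = 0,$$

or

$$t = \frac{A - d}{1 - d}. \tag{13}$$

In order to have 0 < d < t, we must have d < 1. Solving (12) using (13) gives

$$m_1 = \frac{A - 1}{A - 1 + (1 - d)^2}, \qquad m_2 = \frac{(1 - d)^2}{A - 1 + (1 - d)^2},$$

and

$$P = \Pr(x \le d) = m_1 + \frac{m_2 d}{t} = 1 - \frac{(1 - d)^2}{A - d}.$$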
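
The Case 4 solution can be checked numerically. The sketch below is an illustration only: it assumes the normalization described above (a rectangular distribution at t contributes t to the first condition and t² to the second) and that conditions (12) read m1 + m2 = 1, m1·d + m2·t = 1, m1·d² + m2·t² = A. The helper names and the sample values A = 2, d = 1/2 are hypothetical.

```python
from fractions import Fraction


def rect_cdf(t, x):
    """Distribution function at x >= 0 of the rectangular distribution on [0, t]."""
    return min(x / t, Fraction(1))


def case4_solution(A, d):
    """Location t and strengths (m1 at d, m2 at t) assumed for Case 4."""
    t = (A - d) / (1 - d)
    denom = A - 1 + (1 - d) ** 2
    m1 = (A - 1) / denom
    m2 = (1 - d) ** 2 / denom
    return t, m1, m2


A, d = Fraction(2), Fraction(1, 2)
t, m1, m2 = case4_solution(A, d)

total_mass = m1 + m2
first_condition = m1 * d + m2 * t
second_condition = m1 * d ** 2 + m2 * t ** 2

# Probability mass at or below d for the mixture of the two rectangles.
P = m1 * rect_cdf(d, d) + m2 * rect_cdf(t, d)
closed_form = 1 - (1 - d) ** 2 / (A - d)
```

For these sample values the mixture satisfies all three conditions and P agrees with the closed form 1 − (1 − d)²/(A − d).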


CASE 5. Rectangular distributions at zero, t and infinity with 0 < d < t. Then

$$m_0 + m_1 = 1, \qquad m_1 t = 1, \qquad m_1 t^2 + \lambda = A,$$

whence $m_1 = 1/t$, and consequently $1 \le t \le A$. Now

$$P = \Pr(x \le d) = m_0 + \frac{m_1 d}{t} = 1 - \frac{1}{t} + \frac{d}{t^2},$$

and P can be an extremum only if t = 1, t = A, or dP/dt = 0. In the first alternative we have

$$P = d.$$

The second alternative gives

$$P = 1 - \frac{1}{A} + \frac{d}{A^2}.$$

The last alternative arises when $1/t^2 = 2d/t^3$ or $d = t/2$. Then

$$P = 1 - \frac{1}{4d}, \qquad \frac{1}{2} \le d \le \frac{A}{2}.$$
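
The stationary point in Case 5 can be illustrated numerically. The sketch below assumes the Case 5 expression P(t) = 1 − 1/t + d/t² on the interval 1 ≤ t ≤ A and compares its value at t = 2d with its values on a grid. The grid size and the sample values d = 3/4, A = 2 are arbitrary choices made for the illustration.

```python
from fractions import Fraction


def P(t, d):
    """Assumed Case 5 probability Pr(x <= d) as a function of the free point t."""
    return 1 - 1 / t + d / t ** 2


d, A = Fraction(3, 4), Fraction(2)
t_star = 2 * d  # stationary point of P in t

# Evenly spaced grid on the admissible interval 1 <= t <= A.
grid = [1 + Fraction(i, 100) * (A - 1) for i in range(101)]
values = [P(t, d) for t in grid]

p_star = P(t_star, d)
is_minimum_on_grid = all(v >= p_star for v in values)
```

Since dP/dt = (t − 2d)/t³ changes sign from negative to positive at t = 2d, the stationary value 1 − 1/(4d) is a minimum of P over t, which is consistent with its use as a lower bound.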
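We summarize these results in the following theorem.

THEOREM 4. If F(x) is a unimodal distribution whose first and second moments are 1/2 and A/3, then

$$\begin{aligned}
F(d) &\le 1 - \frac{(1 - d)^2}{A - d} & &\text{for } 0 \le d \le 1, \\
F(d) &\le 1 & &\text{for } 1 \le d, \\
F(d) &\ge d & &\text{for } 0 \le d \le \tfrac{1}{2}, \\
F(d) &\ge 1 - \frac{1}{4d} & &\text{for } \tfrac{1}{2} \le d \le \tfrac{A}{2}, \\
F(d) &\ge 1 - \frac{1}{A} + \frac{d}{A^2} & &\text{for } \tfrac{A}{2} \le d \le \tfrac{2A}{3}, \\
F(d) &\ge \frac{(3t - 1)(t - 1)}{3t^2 - 4t + A} & &\text{for } \tfrac{2A}{3} \le d,
\end{aligned}$$

where

$$d = \frac{2t^2 (t - 1)}{3t^2 - 4t + A}.$$

These inequalities are the best possible.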
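Removing the normalization from F, we get the following theorem.

THEOREM 4'. If F(x) is a unimodal distribution with moments $M_1$ and $M_2$, then

$$\begin{aligned}
F(d) &\le 1 - \frac{(2M_1 - d)^2}{3M_2 - 2dM_1}, & 0 &\le d < 2M_1, \\
F(d) &\le 1, & 2M_1 &\le d, \\
F(d) &\ge \frac{d}{2M_1}, & 0 &\le d \le M_1, \\
F(d) &\ge 1 - \frac{M_1}{2d}, & M_1 &\le d \le \frac{3M_2}{4M_1}, \\
F(d) &\ge 1 - \frac{4M_1^2}{3M_2} + \frac{8M_1^3 d}{9M_2^2}, & \frac{3M_2}{4M_1} &\le d \le \frac{M_2}{M_1}, \\
F(d) &\ge \frac{(3t - 1)(t - 1)}{3t^2 - 4t + A}, & \frac{M_2}{M_1} &\le d,
\end{aligned}$$

where $A = 3M_2/4M_1^2$ and

$$d = \frac{4M_1 t^2 (t - 1)}{3t^2 - 4t + A}.$$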
ON SOME THEOREMS IN COMBINATORICS RELATING
TO INCOMPLETE BLOCK DESIGNS


By KULENDRA N. MAJUMDAR


University of Delhi


1. Summary. In this paper we have studied certain combinatorial properties of incomplete block designs and efficient necessary conditions for the existence of affine resolvable balanced incomplete block (b.i.b., for abbreviation) designs. Two theorems give combinatorial properties of certain b.i.b. designs. The well known inequality of Fisher between the number of varieties and number of blocks is shown in this paper to hold under very general conditions. An intrinsic characteristic property of symmetrical b.i.b. designs is given in another theorem. In the last two theorems we have deduced efficient necessary conditions for the existence of affine resolvable b.i.b. designs. Besides these there are some minor results. Utilizing the simple yet very fruitful idea of associating an incidence matrix with a design, all the results are deduced with the help of arguments of algebra of matrices and linear equations. The last theorem requires the use of the celebrated four square theorem of Lagrange and a result due to Legendre in number theory.


2. Introduction. An arrangement of objects of v different varieties into b blocks (or sets) in such a way that (i) no two blocks are identical (i.e., contain the same varieties), (ii) a variety occurs at most once in a block, (iii) no block contains all the varieties, may be called an incomplete block design. If an incomplete block design satisfies the further conditions that (iv) all the blocks are of equal size (i.e., every block contains the same number of objects or varieties, say, k) and (v) any pair of varieties occurs together in the same number, say $\lambda$ (where $\lambda \ne 0$), of blocks it is called a balanced incomplete block design. These designs were introduced by Yates in applied statistics. It easily follows that every variety is replicated (i.e., occurs in the whole design) the same number, say r, of times in a b.i.b. design. For, consider the $r_i$ blocks in which the ith variety occurs. Each of the other varieties must occur together with the ith variety in $\lambda$ of the $r_i$ blocks considered. The total number of objects in these blocks is therefore $(v - 1)\lambda + r_i$. Also this total number equals $r_i k$, so that

$$r_i = \frac{\lambda (v - 1)}{k - 1} = r, \tag{2.1}$$

say. Counting the total number of objects in the whole design in two ways we get

$$vr = bk. \tag{2.2}$$

Received 10/28/52.

A b.i.b. design in which b = v is called a symmetrical b.i.b. design. (2.1) and (2.2) are trivial necessary conditions for the existence of a b.i.b. design with the parameters v, b, r, k, $\lambda$.

A v × b matrix A completely characterizing an incomplete block design can be constructed in the following manner. List the varieties in a column and the blocks in a row. Insert 1 or 0 at the intersection of ith row and jth column according as the ith variety occurs in the jth block or not; i = 1, 2, ..., v; j = 1, 2, ..., b. The sum of the elements of a row of an incidence matrix for a b.i.b. design is r while the similar column sum is k. The scalar product of any two row vectors is $\lambda$. Hence for a b.i.b. design,

$$AA' = \begin{pmatrix}
r & \lambda & \cdots & \lambda \\
\lambda & r & \cdots & \lambda \\
\vdots & \vdots & \ddots & \vdots \\
\lambda & \lambda & \cdots & r
\end{pmatrix}, \tag{2.3}$$

$$|AA'| = (r - \lambda)^{v-1} \left( r + \lambda (v - 1) \right), \tag{2.4}$$

$$|A|^2 = |AA'| = (r - \lambda)^{v-1} r^2 \quad \text{for a symmetrical b.i.b. design.} \tag{2.5}$$
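
The incidence-matrix relations can be tried on a small example. The following Python sketch is an illustration, not part of the paper: it uses the seven-point design with v = b = 7, r = k = 3, λ = 1 (the block list is a standard choice made here), builds A, and checks the Gram matrix AA' and its determinant in exact arithmetic. The helper names `gram` and `det` are hypothetical.

```python
from fractions import Fraction

# A symmetrical design with v = b = 7, r = k = 3, lambda = 1 (assumed example).
blocks = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]
v, b, r, k, lam = 7, 7, 3, 3, 1

# Rows are varieties, columns are blocks.
A = [[1 if variety in blk else 0 for blk in blocks] for variety in range(1, v + 1)]


def gram(M):
    """Return M M' (scalar products of all pairs of rows)."""
    return [[sum(x * y for x, y in zip(r1, r2)) for r2 in M] for r1 in M]


def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, result = len(M), Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if M[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result
        result *= M[col][col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            M[i] = [a - factor * c for a, c in zip(M[i], M[col])]
    return result


G = gram(A)
det_G = det(G)
det_A = det(A)
expected = (r - lam) ** (v - 1) * (r + lam * (v - 1))
```

Here every diagonal entry of AA' equals r, every off-diagonal entry equals λ, and the determinant equals (r − λ)^(v−1) (r + λ(v − 1)) = 576, which is also the square of |A| because the design is symmetrical.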


3. General nature of Fisher's inequality. Fisher's inequality [1] namely, $b \ge v$, for a b.i.b. design has been proved by several authors by different methods of which [2] may be noted. The following discussion reveals the very general nature of the inequality.

The rank of an arbitrary v × b matrix A can exceed neither the number of its rows nor the number of its columns and

$$\operatorname{rank} A = \operatorname{rank} A' = \operatorname{rank} (A'A) = \operatorname{rank} (AA'). \tag{3.1}$$

If rank (AA') = t then $t \le \min(b, v)$, so that for the inequality $b \ge v$ to hold it is enough that $|AA'| \ne 0$. Consider, for instance, the matrix A which satisfies the conditions,

(i) the scalar product of any two, without loss of generality, say, of its first c row vectors is $\lambda_1$, where $0 < \lambda_1 < \min(r_1, r_2, \dots, r_c)$, and where $r_i$ denotes the square of the length of the ith row vector. Similarly the scalar product of any two of its other v − c row vectors is $\lambda_2$ where

$$0 < \lambda_2 < \min(r_{c+1}, r_{c+2}, \dots, r_v),$$

(ii) the scalar product of any of the first set of c row-vectors with any of the second set of v − c row vectors is $\lambda$ where $\lambda^2 \le \lambda_1 \lambda_2$.

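Then

$$|AA'| = \begin{vmatrix}
r_1 & \lambda_1 & \cdots & \lambda_1 & \lambda & \cdots & \lambda \\
\lambda_1 & r_2 & \cdots & \lambda_1 & \lambda & \cdots & \lambda \\
\vdots & & \ddots & \vdots & \vdots & & \vdots \\
\lambda_1 & \lambda_1 & \cdots & r_c & \lambda & \cdots & \lambda \\
\lambda & \lambda & \cdots & \lambda & r_{c+1} & \cdots & \lambda_2 \\
\vdots & & & \vdots & \vdots & \ddots & \vdots \\
\lambda & \lambda & \cdots & \lambda & \lambda_2 & \cdots & r_v
\end{vmatrix}. \tag{3.2}$$

To evaluate this determinant subtract the cth row from the previous rows and similarly the vth row from all rows above it and below the cth. Add suitable multiples of the first, second, ..., (c − 1)st column to the cth column so as to make its first (c − 1) elements 0. Similarly reduce to 0 the (c + 1)st, ..., (v − 1)st elements of the last column. Thus

$$|AA'| = \prod_{i=1}^{c-1} (r_i - \lambda_1) \prod_{i=c+1}^{v-1} (r_i - \lambda_2) \cdot \left\{ r_c r_v - \lambda^2 + de(\lambda_1 \lambda_2 - \lambda^2) + d(r_v \lambda_1 - \lambda^2) + e(r_c \lambda_2 - \lambda^2) \right\}, \tag{3.3}$$

where

$$d = \sum_{i=1}^{c-1} \frac{r_c - \lambda_1}{r_i - \lambda_1}, \qquad e = \sum_{i=1}^{v-c-1} \frac{r_v - \lambda_2}{r_{c+i} - \lambda_2}. \tag{3.4}$$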

Clearly $|AA'| > 0$ and consequently $b \ge v$. We observe that this is a general result concerning the shape of a matrix when (i) and (ii) are satisfied. In particular, if we suppose A to be an incidence matrix for an incomplete block design (when of course the scalar products and lengths of row-vectors will have obvious interpretations in terms of varieties and blocks), we get the inequality $b \ge v$ for a more general class of designs from the above result provided we replace (ii) by the condition $\lambda \le \min(\lambda_1, \lambda_2)$. Plainly, imposing diverse and more general conditions on A we can ensure that $|AA'| \ne 0$.

Before leaving this topic we note that if $l_{ij}$ is the number of varieties common in the ith and jth blocks, i = 1, 2, ..., b; j = 1, 2, ..., b in any incomplete block design, then the rank of the matrix $(l_{ij})$ is equal to the rank of the incidence matrix. Trivially, every v × v minor of the determinant $|l_{ij}|$, situated symmetrically about the main diagonal is a perfect square. For a b.i.b. design at least one of these minors is a nonzero perfect square since at least one set of v columns of the corresponding incidence matrix is independent.
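
The determinant evaluation of this section can be checked on a concrete matrix. The sketch below is illustrative and rests on an assumption about the garbled source: it takes formula (3.3) to be the product of (r_i − λ1) for i < c, the product of (r_i − λ2) for c < i < v, and the bracket r_c r_v − λ² + de(λ1λ2 − λ²) + d(r_v λ1 − λ²) + e(r_c λ2 − λ²), with d and e the sums in (3.4). It compares this with a direct exact determinant. All function names and the sample values are hypothetical.

```python
from fractions import Fraction


def gram_matrix(rs, c, lam1, lam2, lam):
    """Matrix with r_i on the diagonal, lam1 and lam2 inside the two groups
    of rows (first c rows, remaining rows) and lam between the groups."""
    v = len(rs)

    def entry(i, j):
        if i == j:
            return rs[i]
        if i < c and j < c:
            return lam1
        if i >= c and j >= c:
            return lam2
        return lam

    return [[Fraction(entry(i, j)) for j in range(v)] for i in range(v)]


def det(M):
    """Determinant by exact Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, result = len(M), Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if M[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            result = -result
        result *= M[col][col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            M[i] = [a - factor * b for a, b in zip(M[i], M[col])]
    return result


def formula_33(rs, c, lam1, lam2, lam):
    """Closed form assumed for (3.3), with d and e as in (3.4)."""
    v = len(rs)
    rc, rv = rs[c - 1], rs[v - 1]
    d = sum(Fraction(rc - lam1, rs[i] - lam1) for i in range(c - 1))
    e = sum(Fraction(rv - lam2, rs[i] - lam2) for i in range(c, v - 1))
    prod = Fraction(1)
    for i in range(c - 1):
        prod *= rs[i] - lam1
    for i in range(c, v - 1):
        prod *= rs[i] - lam2
    brace = (rc * rv - lam ** 2 + d * e * (lam1 * lam2 - lam ** 2)
             + d * (rv * lam1 - lam ** 2) + e * (rc * lam2 - lam ** 2))
    return prod * brace


# Sample values satisfying (i) and (ii): lam1 < r_1..r_3, lam2 < r_4, r_5, lam^2 <= lam1*lam2.
args = ([5, 6, 7, 4, 8], 3, 2, 3, 2)
direct = det(gram_matrix(*args))
closed = formula_33(*args)
```

The two values agree for these inputs and are positive, in line with the remark that |AA'| > 0 under conditions (i) and (ii).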


4. A characteristic property of symmetrical b.i.b. designs. What is the nature of an incomplete block design in which every pair of varieties occurs together in $\lambda$ blocks and every pair of blocks has $\lambda'$ varieties common? The answer is given in the following.

THEOREM 1. If in an incomplete block design every pair of varieties occurs together in $\lambda$ blocks and any two blocks have $\lambda'$ common varieties, then the design is a symmetrical b.i.b. design. (In case $\lambda = 1$ we further assume that there are at least two blocks, each containing at least 3 varieties.)

We observe that not both $\lambda$ and $\lambda'$ can be 0. We consider two cases according as $\lambda = 1$ or $\lambda > 1$.

CASE I. $\lambda = 1$. We give an indication of the proof. For this case use the terms "points" and "lines" instead of varieties and blocks.

When $\lambda = 1$, $\lambda'$ is also 1 as can be seen by considering two intersecting lines. Now, there cannot be any line with only two points on it. For, it is easy to see by drawing a diagram or by considering the incidence matrix that this implies that the system consists of a set of concurrent lines and a transversal, each line except the transversal having only two points on it. But by our assumption we have at least two lines with at least 3 points on each. Hence every line must contain at least 3 points. Thus the system becomes a finite plane projective geometry of Veblen and our theorem is a well known result in that geometry. We notice here that the third postulate of that geometry, namely, every line contains at least 3 points, can be replaced by the postulate that there are at least two lines each containing at least three points.

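CASE II. $\lambda > 1$. Let $r_1, r_2, \dots, r_v$ respectively denote the numbers of replications of the varieties and $k_1, k_2, \dots, k_b$ respectively be the sizes of the blocks. If A is the incidence matrix of the design we have

$$A \begin{pmatrix} k_1 \\ k_2 \\ \vdots \\ k_b \end{pmatrix} = \begin{pmatrix} \lambda(v - 1) + r_1 \\ \lambda(v - 1) + r_2 \\ \vdots \\ \lambda(v - 1) + r_v \end{pmatrix}, \tag{4.1}$$

$$A' \begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_v \end{pmatrix} = \begin{pmatrix} \lambda'(b - 1) + k_1 \\ \lambda'(b - 1) + k_2 \\ \vdots \\ \lambda'(b - 1) + k_b \end{pmatrix}, \tag{4.2}$$

and

$$A'A = \begin{pmatrix}
k_1 & \lambda' & \cdots & \lambda' \\
\lambda' & k_2 & \cdots & \lambda' \\
\vdots & \vdots & \ddots & \vdots \\
\lambda' & \lambda' & \cdots & k_b
\end{pmatrix}. \tag{4.3}$$

Premultiply (4.1) by A' and use the value of A'A from (4.3). We then get

$$\begin{pmatrix}
k_1 & \lambda' & \cdots & \lambda' \\
\lambda' & k_2 & \cdots & \lambda' \\
\vdots & \vdots & \ddots & \vdots \\
\lambda' & \lambda' & \cdots & k_b
\end{pmatrix}
\begin{pmatrix} k_1 \\ k_2 \\ \vdots \\ k_b \end{pmatrix}
= A' \begin{pmatrix} \lambda(v - 1) + r_1 \\ \lambda(v - 1) + r_2 \\ \vdots \\ \lambda(v - 1) + r_v \end{pmatrix}, \tag{4.4}$$

and using (4.2) we obtain

$$\begin{pmatrix}
k_1 & \lambda' & \cdots & \lambda' \\
\lambda' & k_2 & \cdots & \lambda' \\
\vdots & \vdots & \ddots & \vdots \\
\lambda' & \lambda' & \cdots & k_b
\end{pmatrix}
\begin{pmatrix} k_1 \\ k_2 \\ \vdots \\ k_b \end{pmatrix}
= \begin{pmatrix}
k_1 \lambda(v - 1) + \lambda'(b - 1) + k_1 \\
k_2 \lambda(v - 1) + \lambda'(b - 1) + k_2 \\
\vdots \\
k_b \lambda(v - 1) + \lambda'(b - 1) + k_b
\end{pmatrix}. \tag{4.5}$$

The ith equation from (4.5) is

$$k_i^2 + \lambda' \sum_j k_j - \lambda' k_i = k_i \lambda(v - 1) + \lambda'(b - 1) + k_i, \tag{4.6}$$

that is,

$$k_i^2 - k_i \left( \lambda' + \lambda(v - 1) + 1 \right) + \lambda' m - \lambda'(b - 1) = 0, \tag{4.7}$$

where m is the total number of objects in the design. If $\alpha$, $\beta$ are the roots of (4.7) then $k_i = \alpha$ or $\beta$. We now show that either all $k_i$'s are equal to $\alpha$ or all are equal to $\beta$. If possible, let $k_i = \alpha$; $k_j = \beta$. In that case,

$$v \ge k_i + k_j - \lambda' = \alpha + \beta - \lambda' = \lambda(v - 1) + 1, \tag{4.8}$$

which is absurd unless $\lambda = 1$. Hence all the block sizes are equal, to k, say. From (4.1) or from (2.1) we get

$$r_i = \frac{\lambda(v - 1)}{k - 1} = r. \tag{4.9}$$

Finally

$$|AA'| = (r - \lambda)^{v-1} \left( r + \lambda(v - 1) \right) \ne 0; \qquad |A'A| = (k - \lambda')^{b-1} \left( k + \lambda'(b - 1) \right) \ne 0.$$

Therefore, rank (AA') = v and rank (A'A) = b and thus b = v. From the relation vr = bk, it then follows that r = k and from the relations $\lambda(v - 1) = r(k - 1)$ and $\lambda'(b - 1) = k(r - 1)$ we get $\lambda = \lambda'$. This proves that the design is a symmetrical b.i.b. design.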


5. Two combinatorial properties of certain b.i.b. designs. Let us now consider the following solution of the b.i.b. design v = 16, b = 24, r = 9, k = 6, $\lambda$ = 3 given by K. N. Bhattacharya [5].
(1, 2, 7, 8, 14, 15) (3, (2,
(3, 5, 8, 9, 12, 14) (1, (2, 
(3, 4, 7, 10, 12, 16) (3, (4, 
(2, 4, 9, 10, 11, 13) (3, 6 (1, { 
(1, 4, 7, 8, 11, 16) (2, (5, 
(1, 6, 8, 10, 12, 13) (1, (2, 


(1, 4, 5, 13, 14, 16) (2, (1, 3, 9, 10, 15, 16) 
(4, 6, 8, 9, 11, 15) (1, 5, § (11, 12, 13, 14, 15, 16) 


where 1, 2, ..., 16 denote the 16 varieties. If we compare the numbers of varieties common to the 20th and 24th blocks (counting is done in a vertical way) and other blocks, and similarly for the 6th and 10th blocks, we obtain the following tables.

[Two tables. Column headings: Block; number of varieties common with Block 20, Block 24, Block 6, Block 10. The individual entries are not legible in the source.]

The two right-hand columns are identical in the first table. A superficial inspec- 
tion of the second table reveals that the sum of its two right-hand columns gives 
a constant. This led the author to conjecture Theorems 2 and 3. The first proofs 
were deduced from the main theorem given in the doctoral dissertation [6] of 
Connor. Later on independent proofs were obtained and these are given below. 

THEOREM 2. The necessary and sufficient condition for the existence in a b.i.b. design of two blocks such that the numbers of varieties common to any other block and these two are equal, is that these two blocks contain $k + \lambda - r$ common varieties.

Let two blocks (which we take to be the first and second blocks) contain c common varieties. Let the ith block contain $x_i$ varieties which occur in the first but not in the second block. Similarly it has $y_i$ varieties which occur in the second but not in the first block. Considering the combinations of varieties taken two at a time we have

$$\sum_{i=3}^{b} x_i (x_i - 1) = \sum_{i=3}^{b} y_i (y_i - 1) = (k - c)(k - c - 1)(\lambda - 1), \tag{5.1}$$

$$\sum_{i=3}^{b} x_i y_i = (k - c)^2 \lambda, \tag{5.2}$$

$$\sum_{i=3}^{b} x_i = \sum_{i=3}^{b} y_i = (k - c)(r - 1). \tag{5.3}$$

Using these we have

$$\sum_{i=3}^{b} (x_i - y_i)^2 = 2(k - c)(c + r - \lambda - k). \tag{5.4}$$

If $c = k + \lambda - r$ we must have $x_i = y_i$ for all i and conversely. Hence the theorem.
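
Identity (5.4) can be tested on a small design. The sketch below is illustrative and uses an assumed example that does not appear in the paper: the nine points and twelve lines of the affine plane over the integers mod 3, a b.i.b. design with v = 9, b = 12, r = 4, k = 3, λ = 1. For every pair of blocks it compares the sum of (x_i − y_i)² over the remaining blocks with 2(k − c)(c + r − λ − k). The names `lines` and `check_pair` are hypothetical.

```python
from itertools import combinations

n = 3
# Lines y = m*x + c0 (mod 3) together with the vertical lines x = c0.
lines = [frozenset((x, (m * x + c0) % n) for x in range(n))
         for m in range(n) for c0 in range(n)]
lines += [frozenset((c0, y) for y in range(n)) for c0 in range(n)]

v, b, r, k, lam = 9, 12, 4, 3, 1


def check_pair(B1, B2):
    """Return (c, left side of (5.4), right side of (5.4)) for blocks B1, B2."""
    c = len(B1 & B2)
    only_first, only_second = B1 - B2, B2 - B1
    total = 0
    for B in lines:
        if B == B1 or B == B2:
            continue
        x = len(B & only_first)    # varieties of B in the first block only
        y = len(B & only_second)   # varieties of B in the second block only
        total += (x - y) ** 2
    return c, total, 2 * (k - c) * (c + r - lam - k)


results = [check_pair(B1, B2) for B1, B2 in combinations(lines, 2)]
```

Pairs of parallel lines have c = 0 = k + λ − r, and for them the sum vanishes, so every other block meets the two lines in equally many points, as Theorem 2 states. Intersecting lines have c = 1 and give the value 4 on both sides.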

THEOREM 3. The necessary and sufficient condition for the existence in a b.i.b. design of two blocks, such that if any other block has s varieties in common with the first block it has $2\lambda k/r - s$ or $2kr/b - s$ varieties in common with the second, is that the two blocks have $r - \lambda - k + 2\lambda k/r$ or $2kr/b - k$ varieties in common, respectively.

As before, let two blocks have c common varieties. Let the ith block have $x_i$ varieties which occur in the first but not in the second block, $y_i$ varieties which occur both in the first and second blocks, and $z_i$ varieties which occur in the second but not in the first block. Then

$$\sum_{i=3}^{b} x_i (x_i - 1) = \sum_{i=3}^{b} z_i (z_i - 1) = (k - c)(k - c - 1)(\lambda - 1), \tag{5.5}$$

$$\sum_{i=3}^{b} x_i z_i = (k - c)^2 \lambda, \tag{5.6}$$

$$\sum_{i=3}^{b} x_i y_i = \sum_{i=3}^{b} y_i z_i = (k - c)\, c\, (\lambda - 1), \tag{5.7}$$

$$\sum_{i=3}^{b} y_i (y_i - 1) = c(c - 1)(\lambda - 2), \tag{5.8}$$

$$\sum_{i=3}^{b} x_i = \sum_{i=3}^{b} z_i = (k - c)(r - 1), \tag{5.9}$$

$$\sum_{i=3}^{b} y_i = c(r - 2). \tag{5.10}$$

We now form the sum $\sum_{i=3}^{b} (x_i + 2y_i + z_i - w)^2$ where w is the mean of the variable $x_i + 2y_i + z_i$, that is, $w = (2(r - 1)k - 2c)/(b - 2)$. Using the results (5.5) to (5.10) and (2.1) and (2.2) one finds that the cumbersome sum factorizes as

$$\frac{2b}{b - 2} \left( r - \lambda - k + \frac{2\lambda k}{r} - c \right) \left( c - \frac{2rk}{b} + k \right). \tag{5.11}$$

If $c = r - \lambda - k + 2\lambda k/r$ we have $x_i + 2y_i + z_i = w = 2\lambda k/r$ for all i. Conversely if $x_i + 2y_i + z_i = 2\lambda k/r$ we must have $c = r - \lambda - k + 2\lambda k/r$ since $w = 2\lambda k/r$ in this case. As $x_i + 2y_i + z_i$ represents the sum of the numbers of varieties common to the ith block and the first and second blocks, the first part of the theorem follows. Similarly for the other part.

As a by-product, we get from (5.4) and (5.11) the inequality

$$\max \left( \frac{2rk}{b} - k, \; k + \lambda - r \right) \le l \le r - \lambda - k + \frac{2\lambda k}{r}, \tag{5.12}$$

where l is the number of varieties common to any two blocks. This inequality was obtained by Connor [6] by a different method.


6. Necessary conditions for the existence of affine resolvable b.i.b. designs. If in a b.i.b. design the blocks are separable into groups such that the blocks in any group (or "replication") contain between them all the varieties, each variety occurring once and only once in the replication, the design is defined to be a resolvable balanced incomplete block (r.b.i.b.) design. Of course, all the replications necessarily contain the same number of blocks, say n, and then

$$v = nk. \tag{6.1}$$

Some properties of r.b.i.b. designs were studied by Bose [3]. These designs are of special importance in analysis of variance. Here we are interested only in their structural properties and will study them through incidence matrices.

In forming the incidence matrix A of a resolvable design we arrange the blocks in such a way that the consecutive columns of A correspond to the blocks in a replication. Thus the first n columns correspond to the n blocks of the first replication, the next n columns to those of the second replication and so on. An important piece of information about A is that on any row, the portion cut off by the n columns corresponding to a replication, contains 1 once and only once. This fact will be fully exploited in Theorems 6 and 7. Since the sum of the n column vectors corresponding to any replication is a column vector of 1's, we get Bose's inequality $b - r + 1 \ge \operatorname{rank} A = v$ by (2.4), for r.b.i.b. designs. The r.b.i.b. designs for which

$$b = v + r - 1 \tag{6.2}$$

or equivalently,

$$r = k + \lambda \tag{6.3}$$

are called affine r.b.i.b. designs. These possess some properties somewhat analogous to those of symmetrical b.i.b. designs. We give here an alternative proof for such a property due to Bose. Consider the n blocks constituting a replication. Since the number of varieties common between any two of these blocks is $k + \lambda - r$, that is, 0, Theorem 2 can be applied. If therefore any block not belonging to this replication has c varieties common to the first block, it must have c varieties common to the second and so c varieties common to the third and so on successively. Since the n blocks are disjoint and contain all the varieties

$$nc = k, \quad \text{or,} \quad c = \frac{k}{n} = \frac{k^2}{v}. \tag{6.4}$$

Using (2.1), (6.1) to (6.4) it is not difficult to show that the parameters of an affine r.b.i.b. design can always be expressed in terms of two integral variables as

$$v = n(n^2 t - nt + n), \quad b = n(n^2 t + n + 1), \quad r = n^2 t + n + 1, \quad k = n^2 t - nt + n, \quad \lambda = nt + 1. \tag{6.5}$$

From the property in (6.4) it immediately follows that if we choose a block in an affine r.b.i.b. design and break up every other block (not belonging to the replication to which the chosen block belongs) into two subblocks, one containing the $k^2/v$ varieties which occur in the chosen block and the other containing the residual varieties, then the first set of b − n subblocks constitutes the r.b.i.b. design,

$$v' = n(nt - t + 1), \quad b' = n(n^2 t + n), \quad r' = n^2 t + n, \quad k' = nt - t + 1, \quad \lambda' = nt. \tag{6.6}$$
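
The parametrizations can be checked against the defining relations. The sketch below is illustrative and assumes that (6.5) reads v = n²(nt − t + 1), b = n(n²t + n + 1), r = n²t + n + 1, k = n(nt − t + 1), λ = nt + 1 and that (6.6) reads v' = n(nt − t + 1), b' = n(n²t + n), r' = n²t + n, k' = nt − t + 1, λ' = nt. The function names and the ranges of n and t are arbitrary choices.

```python
def affine_params(n, t):
    """Parameters (v, b, r, k, lam) assumed for (6.5)."""
    s = n * t - t + 1
    return n * n * s, n * (n * n * t + n + 1), n * n * t + n + 1, n * s, n * t + 1


def derived_params(n, t):
    """Parameters (v', b', r', k', lam') assumed for (6.6)."""
    s = n * t - t + 1
    return n * s, n * (n * n * t + n), n * n * t + n, s, n * t


def satisfies_bib_relations(v, b, r, k, lam):
    """Relations (2.1) and (2.2): lam (v - 1) = r (k - 1) and v r = b k."""
    return lam * (v - 1) == r * (k - 1) and v * r == b * k


checks = []
for n in range(2, 7):
    for t in range(0, 6):
        v, b, r, k, lam = affine_params(n, t)
        v1, b1, r1, k1, lam1 = derived_params(n, t)
        checks.append(
            satisfies_bib_relations(v, b, r, k, lam)
            and satisfies_bib_relations(v1, b1, r1, k1, lam1)
            and v == n * k                 # (6.1)
            and b == v + r - 1             # (6.2)
            and r == k + lam               # (6.3)
            and k * k == v * k1            # k' = k^2 / v
            and (v1, b1, r1, lam1) == (k, b - n, r - 1, lam - 1)
        )

example = affine_params(3, 2)
```

For n = 3, t = 2 this gives the parameter set (45, 66, 22, 15, 7), and all listed relations hold on the whole grid of n and t tried here.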


Though numerous combinatorial structures like b.i.b. designs, finite projective geometries, etc. have been constructed and a large number of sufficient conditions accumulated [4], yet even now very little is known about necessary conditions for their existence. Only recently Bruck, Schützenberger, Chowla, Ryser, Shrikhande, Hall, Mann and others have obtained some important necessary conditions for the existence of finite plane projective geometries, symmetrical b.i.b. designs and cyclic projective planes.

For symmetrical b.i.b. designs the following two theorems are known.

THEOREM 4. If for given r, $\lambda$ and even v a symmetrical b.i.b. design exists, then $r - \lambda$ is a perfect square.

This theorem seems to have been obtained first by Schützenberger [7]. It was independently discovered by Chowla and Ryser [8], and Shrikhande [10]. The proof is almost trivial and follows immediately from (2.5).

THEOREM 5. (Chowla and Ryser). If for given r, $\lambda$ and odd v a symmetrical b.i.b. design exists, then

$$\left( \frac{(-1)^{(v-1)/2} \lambda}{p} \right) = +1,$$

where p is any odd prime which divides the square-free part of $r - \lambda$ and (m/n) is the Legendre symbol in number theory. In case p divides $\lambda$ we will have

$$\left( \frac{(-1)^{(v-1)/2} \lambda_1}{p} \right) = +1 \quad \text{or} \quad \left( \frac{(-1)^{(v+1)/2} \lambda_1 a}{p} \right) = +1$$

according as the highest power of p which divides $\lambda$ is even or odd. Here $\lambda_1$ and a respectively denote the greatest divisors of $\lambda$ and $r - \lambda$, which are prime to p.

The proof given by Chowla and Ryser in their important paper [8] is remarkably ingenious and simple. By straightforward generalization of Bruck and Ryser's result [9], Shrikhande independently deduced necessary conditions in terms of Hilbert's norm residue symbols. The last sentence in the statement of Theorem 5 does not occur in [8]. It is due to the present author and can be deduced from Chowla-Ryser's proof.

We now establish some necessary conditions for the existence of affine r.b.i.b. 
designs. Theorems 6 and 7 are precisely analogous to Theorems 4 and 5. These 
results were obtained some time ago. Shrikhande, working independently on the 
same problem, has obtained similar results, which he announced without proof 
in [12]. It may be mentioned here that the proofs given below are direct and self- 
contained. 

THEOREM 6. If for given $b$, $r$, $k$, $\lambda$ and odd $v$ an affine r.b.i.b. design exists, then
$k$ or $k^2/v$ is a perfect square according as $r$ is odd or even.

Adjoin $r - 1$ row vectors to the incidence matrix $A$ of the affine r.b.i.b. design
so as to get a square matrix $A^*$. Let the $i$th of the new row vectors have 1's at
the $((i - 1)n + 1)$st, $((i - 1)n + 2)$nd, ..., $((i - 1)n + n)$th positions in the vector
while the other positions are occupied by 0's, $i = 1, 2, \dots, r - 1$. Remembering
the special nature of the incidence matrices of r.b.i.b. designs noted above
we have

(6.7)  $A^*A^{*\prime} = \begin{pmatrix} (r - \lambda)I_v + \lambda J_{v,v} & J_{v,r-1} \\ J_{r-1,v} & nI_{r-1} \end{pmatrix}$

and

(6.8)  $|A^*|^2 = |A^*A^{*\prime}| = (r - \lambda)^{v-1}\, n^{r-1} \left\{ r - (r - 1)\frac{v}{n} + \lambda(v - 1) \right\} = k^v n^{r-1},$

where $I$ denotes an identity matrix and $J$ a matrix all of whose elements are 1,
since $r - \lambda = k$ by (6.5) and

$r - (r - 1)\frac{v}{n} + \lambda(v - 1) = r - (r - 1)k + r(k - 1) = k.$

From (6.8) it follows that $k$ is a perfect square when $v$ and $r$ are odd. When $v$
is odd and $r$ is even, $v = nk$ must equal a perfect square. From (6.4) we know that
$k^2/v$ is an integer and so when $v$ is a perfect square, $k^2/v$ is also so. This completes
the proof.


As an application of the theorem consider the design $v = 45$, $b = 66$, $r = 22$,
$k = 15$, $\lambda = 7$ obtained from (6.5) by putting $n = 3$, $t = 2$. An affine r.b.i.b.
design with these parameters cannot exist as $v$ and $k^2/v$ are not perfect squares
though $v$ is odd and $r$ is even. Similarly an affine r.b.i.b. design with the
parameters $v = 63$, $b = 93$, $r = 31$, $k = 21$, $\lambda = 10$ is impossible. The impossibility
of the latter design was demonstrated in [11] by a different method.
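
The two applications above can be checked mechanically. The following is a minimal sketch, assuming the parameter formulas of (6.5) in the form $v = n^2(nt - t + 1)$, $b = n(n^2t + n + 1)$, $r = n^2t + n + 1$, $k = n(nt - t + 1)$, $\lambda = nt + 1$; the function names `affine_params` and `passes_theorem_6` are illustrative and do not come from the paper.

```python
from math import isqrt


def affine_params(n, t):
    """Parameters (v, b, r, k, lam) of (6.5) for integers n and t."""
    v = n * n * (n * t - t + 1)
    b = n * (n * n * t + n + 1)
    r = n * n * t + n + 1
    k = n * (n * t - t + 1)
    lam = n * t + 1
    return v, b, r, k, lam


def is_square(x):
    """True when the non-negative integer x is a perfect square."""
    return x >= 0 and isqrt(x) ** 2 == x


def passes_theorem_6(n, t):
    """False when Theorem 6 excludes the design, True when it gives no obstruction."""
    v, b, r, k, lam = affine_params(n, t)
    if v % 2 == 0:
        return True  # Theorem 6 only concerns odd v
    if r % 2 == 1:
        return is_square(k)
    return is_square(k * k // v)  # k^2/v is an integer by (6.4)


print(affine_params(3, 2), passes_theorem_6(3, 2))
print(affine_params(3, 3), passes_theorem_6(3, 3))
```

A value of `True` only means that Theorem 6 does not exclude the parameters; it does not assert that a design exists.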

THEOREM 7. If for given $v$, $b$, $k$, $\lambda$ and $r \equiv 2$ or $3 \pmod 4$ an affine r.b.i.b.
design exists, then every prime of the form $4a + 3$ which divides $v/k$ occurs to an
even exponent in the standard form of $v/k$ (i.e. when $v/k$ is expressed as a product
of distinct prime powers).


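Introduce a column vector $X$ of rational variables $x_1, x_2, \dots, x_{v+r-1}$. Put

(6.9)  $X'A^* = (u_1, u_2, \dots, u_{v+r-1}).$

Then from (6.7) we have

(6.10)  $\sum_{i=1}^{v+r-1} u_i^2 = X'A^*A^{*\prime}X = (r - \lambda)\sum_{i=1}^{v} x_i^2 + n\sum_{i=v+1}^{v+r-1} x_i^2 + \lambda\Big(\sum_{i=1}^{v} x_i\Big)^2 + 2\Big(\sum_{i=1}^{v} x_i\Big)\Big(\sum_{i=v+1}^{v+r-1} x_i\Big),$

that is,

(6.11)  $\sum_{i=1}^{v+r-1} u_i^2 = k\sum_{i=1}^{v} x_i^2 + n\sum_{i=v+1}^{v+r-1} x_i^2 + \lambda\Big(\sum_{i=1}^{v} x_i\Big)^2 + 2\Big(\sum_{i=1}^{v} x_i\Big)\Big(\sum_{i=v+1}^{v+r-1} x_i\Big).$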


Case I. $v$ odd, $r \equiv 3 \pmod 4$. In this case $k = m^2$ where $m$ is an integer by
Theorem 6 and so $k\sum_{i=1}^{v} x_i^2 = \sum_{i=1}^{v} (mx_i)^2$. Now by Lagrange's four square
theorem every positive integer can be expressed as a sum of four integer squares.
So let $n = a^2 + b^2 + c^2 + d^2$ where $a$, $b$, $c$, $d$ are integers. Then

(6.12)  $n\sum_{i=v+1}^{v+r-1} x_i^2 = n(x_{v+r-2}^2 + x_{v+r-1}^2) + \sum_{j=0}^{(r-7)/4} \big\{ (a x_{v+4j+1} + b x_{v+4j+2} + c x_{v+4j+3} + d x_{v+4j+4})^2$
        $+ (b x_{v+4j+1} - a x_{v+4j+2} + d x_{v+4j+3} - c x_{v+4j+4})^2$
        $+ (-c x_{v+4j+1} + d x_{v+4j+2} + a x_{v+4j+3} - b x_{v+4j+4})^2$
        $+ (d x_{v+4j+1} + c x_{v+4j+2} - b x_{v+4j+3} - a x_{v+4j+4})^2 \big\}.$
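
The identity behind (6.12) is Euler's four-square identity applied to each group of four variables: the sum of the squares of the four linear forms equals $(a^2 + b^2 + c^2 + d^2)(x_1^2 + x_2^2 + x_3^2 + x_4^2)$. The sketch below checks this numerically for one group; the helper names and the brute-force search for $a, b, c, d$ are illustrative assumptions and are not part of the paper.

```python
from math import isqrt


def lagrange_four_squares(n):
    """Brute-force search for integers a >= b >= c >= d >= 0 with a^2+b^2+c^2+d^2 = n."""
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            for c in range(min(b, isqrt(n - a * a - b * b)), -1, -1):
                rest = n - a * a - b * b - c * c
                d = isqrt(rest)
                if d * d == rest and d <= c:
                    return a, b, c, d
    raise ValueError("no representation found")


def four_square_forms(a, b, c, d, x):
    """The four linear forms used in (6.12) for one group x = (x1, x2, x3, x4)."""
    x1, x2, x3, x4 = x
    return (
        a * x1 + b * x2 + c * x3 + d * x4,
        b * x1 - a * x2 + d * x3 - c * x4,
        -c * x1 + d * x2 + a * x3 - b * x4,
        d * x1 + c * x2 - b * x3 - a * x4,
    )


n = 21
a, b, c, d = lagrange_four_squares(n)
x = (1, 2, 3, 4)
forms = four_square_forms(a, b, c, d, x)
# Both sides of the identity for this group of four variables.
print(sum(f * f for f in forms), n * sum(xi * xi for xi in x))
```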











388 KULENDRA N. MAJUMDAR 





From (6.9), $X' = (u_1, u_2, \dots, u_{v+r-1})A^{*-1}$. Consequently the $x$'s in (6.11) can
be replaced by the $u$'s. Utilizing (6.12) we can write (6.11) as

(6.13)  $\sum_{i=1}^{v+r-1} u_i^2 = \sum_{i=1}^{v+r-3} y_i^2 + n(y_{v+r-2}^2 + y_{v+r-1}^2) + \lambda y_{v+r}^2 + 2y_{v+r}\,y_{v+r+1},$

where the $y$'s are linear forms in the $u$'s with rational coefficients.

Now set $y_1 - u_1 = 0$ if the coefficient of $u_1$ in $y_1$ is not one, otherwise set
$y_1 + u_1 = 0$. Next set $y_2 - u_2 = 0$ if on elimination of $u_1$ from it by means of
the first equation the coefficient of $u_2$ in the resulting expression does not vanish;
otherwise put $y_2 + u_2 = 0$. Similarly set $y_3 - u_3 = 0$ if the coefficient of $u_3$ in
it, after the elimination of $u_1$ and $u_2$ by using the first two equations, is not 0;
otherwise put $y_3 + u_3 = 0$. Proceed in this way up to the $(v + r - 3)$rd stage.
Finally, eliminate $u_1, u_2, \dots, u_{v+r-3}$ from $y_{v+r} = 0$ using the other equations.
The resulting equations are equivalent to a system of the following form:

(6.14)  $a_{11}u_1 + a_{12}u_2 + \dots + a_{1,v+r-1}u_{v+r-1} = 0,$
        $a_{22}u_2 + \dots + a_{2,v+r-1}u_{v+r-1} = 0,$
        $\dots$
        $a_{tt}u_t + a_{t,t+1}u_{t+1} = 0,$

where $t = v + r - 2$ and $a_{ii} \ne 0$, $i = 1, 2, \dots, t - 1$. A little reflection on
(6.14) shows that there exist integral solutions $(u_1, u_2, \dots, u_{v+r-1})$ for which
at least one of $u_{v+r-2}$ and $u_{v+r-1}$ is a nonzero integer. Thus, $y_i = \pm u_i$, $i = 1, 2,
\dots, v + r - 3$, and $y_{v+r} = 0$. Consequently, with this choice of the $u$'s and
because of the homogeneity of the relation (6.13) we arrive at the nontrivial
relation,

(6.15)  $u_{v+r-2}^2 + u_{v+r-1}^2 = n(p^2 + q^2),$

where all the quantities are integral.

Case II. $v$ even, $r \equiv 3 \pmod 4$. Considering various combinations of $n$, $t$
(mod 4) it is seen from (6.5) that $v \equiv 0 \pmod 4$ for this case. Express $k\sum_{i=1}^{v} x_i^2$
and $n\sum_{i=v+1}^{v+r-3} x_i^2$ as sums of squares of linear forms in the $x$'s as in Case I and
proceed as before. Ultimately we get a nontrivial relation exactly similar to
(6.15).

Case III. $r \equiv 2 \pmod 4$. From (6.5) it follows that $v \equiv 1 \pmod 4$. By Theorem 6,
$kn = s^2$ where $s$ is an integer. Multiply (6.11) throughout by $n$ and
express $n\sum_{i=1}^{v+r-3} u_i^2$ as a sum of squares of $v + r - 3$ linear forms $y_i$ in the $u$'s
(i.e., in the $x$'s by (6.9)). We then get

(6.16)  $\sum_{i=1}^{v+r-3} y_i^2 + n(u_{v+r-2}^2 + u_{v+r-1}^2) = \sum_{i=1}^{v} (sx_i)^2 + \sum_{i=v+1}^{v+r-1} (nx_i)^2 + \lambda n\Big(\sum_{i=1}^{v} x_i\Big)^2 + 2n\Big(\sum_{i=1}^{v} x_i\Big)\Big(\sum_{i=v+1}^{v+r-1} x_i\Big).$

Choosing $x$'s such that $y_i = \pm sx_i$; $i = 1, 2, \dots, v$; $y_i = \pm nx_i$; $i = v + 1,
v + 2, \dots, v + r - 3$ and $\sum_{i=1}^{v} x_i = 0$ as in Case I we obtain a
nontrivial relation

(6.17)  $n(u_{v+r-2}^2 + u_{v+r-1}^2) = p_1^2 + q_1^2,$

where all the quantities are integers.

So for all the cases it follows that n can be expressed as the sum of two integer 
squares. An appeal to a result of Legendre in number theory completes the proof. 

As an application let us consider a design with the parameters $v = 216$, $b = 258$,
$r = 43$, $k = 36$, $\lambda = 7$. An affine resolvable b.i.b. design with these parameters
cannot exist by Theorem 7 since here $v/k = 2 \cdot 3$. An affine r.b.i.b. design
corresponding to $n = 21$, $t = 1$ in (6.5) cannot exist; this is decided by Theorem 7
though the parameters satisfy the condition of Theorem 6. Manifestly these
two theorems are independent though in many cases both lead to the same
conclusion. Finally we observe that the powerful and general theorem of Minkowski
and Hasse [9] on the arithmetical reduction of quadratic forms, when applied
to (6.7) in conjunction with Theorem 6, gives us Theorem 7 and nothing more.
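
The condition of Theorem 7 can be tested by factoring $v/k$. The sketch below assumes, as follows from (6.5), that $v/k = n$ and $r = n^2t + n + 1$; the function names are illustrative and trial division is used only because the numbers involved are small.

```python
def odd_exponent_primes_3_mod_4(n):
    """Primes of the form 4a + 3 dividing n to an odd power (trial division)."""
    bad = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if p % 4 == 3 and e % 2 == 1:
                bad.append(p)
        p += 1
    if n > 1 and n % 4 == 3:  # a leftover prime factor occurs to the first power
        bad.append(n)
    return bad


def passes_theorem_7(n, t):
    """False when Theorem 7 excludes the design with parameters (6.5)."""
    r = n * n * t + n + 1
    if r % 4 not in (2, 3):
        return True  # Theorem 7 says nothing for other residues of r
    return not odd_exponent_primes_3_mod_4(n)  # v/k = n


print(passes_theorem_7(6, 1), odd_exponent_primes_3_mod_4(6))
print(passes_theorem_7(21, 1), odd_exponent_primes_3_mod_4(21))
```

As with Theorem 6, a result of `True` means only that no obstruction was found.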
THE BEHRENS-FISHER PROBLEM FOR REGRESSION
COEFFICIENTS 


By D. A. S. Fraser 
University of Toronto 


1. Summary. For two normal populations with unknown variances and means
depending linearly on $p + q$ regression variables, a Behrens-Fisher generalization
is to test the equality of $q$ regression coefficients in one population with a
corresponding set in the second population. When $q = 1$ a general class of similar
regions is obtained for the hypothesis, and for regions restricted to this class a
most powerful or most powerful unbiased test is found. When $q > 1$ several
tests are presented and discussed.


2. Introduction. 
2.1. The problem. Let $U$ be a normally distributed random variable with mean
$\sum_{r=1}^{p} \beta_r x_r + \sum_{s=1}^{q} \Gamma_s y_s$ and variance $\sigma^2$; the location of the distribution depends
linearly on the “factors” $x_1, \dots, x_p$; $y_1, \dots, y_q$. The regression coefficients
$\beta_r$ for $r = 1, \dots, p$ and $\Gamma_s$ for $s = 1, \dots, q$ are unknown but fixed constants
while the factors $x_1, \dots, x_p$; $y_1, \dots, y_q$ can be fixed or recorded for each
observed value of $U$. Similarly let $U^*$ be normally distributed with mean
$\sum_{r=1}^{p} \beta_r^* x_r^* + \sum_{s=1}^{q} \Gamma_s^* y_s^*$ and variance $\sigma^{*2}$. The problem is to test the
hypothesis $H_q = \{\Gamma_s = \Gamma_s^*;\ s = 1, \dots, q\}$.

Consider a sample of size $m$ from the distribution of $U$; we have

$[U_\alpha;\ x_{1\alpha}, \dots, x_{p\alpha};\ y_{1\alpha}, \dots, y_{q\alpha}] \quad (\alpha = 1, \dots, m).$

Letting the observed values of $U$ and of the factors be column vectors in
$m$-dimensional Euclidean space $R^m$, the sample yields $\{\bar U;\ \bar x_1, \dots, \bar x_p;\ \bar y_1, \dots,
\bar y_q\}$. For given values of the factors, $\bar U$ is distributed in a spherically symmetric
normal distribution about the point $\sum \beta_r \bar x_r + \sum \Gamma_s \bar y_s$ with variance $\sigma^2$. Similarly
a sample of size $n$ from the distribution of $U^*$ is given by $\{\bar U^*;\ \bar x_1^*, \dots, \bar x_p^*;\
\bar y_1^*, \dots, \bar y_q^*\}$ which is a set of vectors in $R^n$. $\bar U^*$ has a spherically symmetric
distribution about $\sum \beta_r^* \bar x_r^* + \sum \Gamma_s^* \bar y_s^*$ with variance $\sigma^{*2}$. We assume for each
sample that the vectors representing the factors are linearly independent.
Consider the following orthogonality conditions on the vectors which record
the observed values of the different factors: $\bar y_1, \dots, \bar y_q$ are mutually orthogonal
and orthogonal to $\bar x_1, \dots, \bar x_p$; $\bar y_1^*, \dots, \bar y_q^*$ are mutually orthogonal and
orthogonal to $\bar x_1^*, \dots, \bar x_p^*$. The problem can always be put in this form by replacing
the given factors by suitable linear combinations thereof. For consider a set
of independent vectors $\bar x_1, \dots, \bar x_p$; $\bar y_1, \dots, \bar y_q$. By using instead the vectors
$\bar x_1, \dots, \bar x_p$; $\bar y_{1\cdot x}, \dots, \bar y_{q\cdot x}$ where $\bar y_{s\cdot x}$ is the regression residual after fitting
$\bar x_1, \dots, \bar x_p$, we have orthogonality between the $\bar x_r$ set and the $\bar y_{s\cdot x}$ set. Doing
similarly for the $\bar x_r^*$, $\bar y_s^*$ we have

$\sum \beta_r \bar x_r + \sum \Gamma_s \bar y_s = \sum \gamma_r \bar x_r + \sum \Gamma_s \bar y_{s\cdot x},$
$\sum \beta_r^* \bar x_r^* + \sum \Gamma_s^* \bar y_s^* = \sum \gamma_r^* \bar x_r^* + \sum \Gamma_s^* \bar y_{s\cdot x}^*$

for suitably defined $\gamma_r$, $\gamma_r^*$; and the hypothesis $H_q$ remains unchanged.
Now replace the vectors $\bar y_{s\cdot x}$ ($\bar y_{s\cdot x}^*$) by linear combinations thereof, say
$\sum_s a_{ts}\bar y_{s\cdot x}$ ($\sum_s a_{ts}\bar y_{s\cdot x}^*$), where the transformation matrix $\|a_{ts}\|$ is chosen so
that both $\|\bar y_{s\cdot x}'\bar y_{t\cdot x}\|$ and $\|\bar y_{s\cdot x}^{*\prime}\bar y_{t\cdot x}^*\|$ become diagonal matrices. The vectors
$\sum_s a_{ts}\bar y_{s\cdot x}$ $(t = 1, \dots, q)$ are then mutually orthogonal and orthogonal to the
$\bar x_r$; similarly for $\sum_s a_{ts}\bar y_{s\cdot x}^*$ and the $\bar x_r^*$. Also the hypothesis $H_q$ is unchanged
since the same linear combinations are used for the vectors $\bar y_{s\cdot x}$ as for $\bar y_{s\cdot x}^*$.
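
The two reduction steps just described (taking regression residuals, then choosing one matrix that diagonalizes both Gram matrices) can be illustrated numerically. The following is a minimal sketch assuming NumPy is available; the function names are hypothetical, and the particular construction (Cholesky factor followed by a symmetric eigendecomposition, which additionally scales the first Gram matrix to the identity) is one convenient choice, not necessarily the one the author had in mind.

```python
import numpy as np


def residualize(Y, X):
    """Residuals of the columns of Y after least-squares regression on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return Y - X @ coef


def simultaneous_diagonalizer(G, G_star):
    """Matrix A with A G A' = I and A G_star A' diagonal (G, G_star symmetric positive definite)."""
    L = np.linalg.cholesky(G)          # G = L L'
    L_inv = np.linalg.inv(L)
    M = L_inv @ G_star @ L_inv.T       # symmetric, so eigh applies
    _, Q = np.linalg.eigh(M)           # M = Q diag(w) Q'
    return Q.T @ L_inv


rng = np.random.default_rng(0)
X, Y = rng.normal(size=(12, 2)), rng.normal(size=(12, 3))            # first sample
X_star, Y_star = rng.normal(size=(15, 2)), rng.normal(size=(15, 3))  # second sample

R, R_star = residualize(Y, X), residualize(Y_star, X_star)
A = simultaneous_diagonalizer(R.T @ R, R_star.T @ R_star)

# Column t of each product is the combination sum_s a_ts * (residual vector s).
Y_new, Y_star_new = R @ A.T, R_star @ A.T
print(np.round(Y_new.T @ Y_new, 6))
print(np.round(Y_star_new.T @ Y_star_new, 6))
```

Because the same matrix `A` is applied in both samples, equality of the two coefficient vectors is preserved, which is the point made in the text.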

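For our problem $P_q$, we summarize the notation and structure as follows.

I. $\bar U$ is normally distributed in $R^m$ with

(1) $E\{\bar U\} = \sum_{r=1}^{p} \beta_r \bar x_r + \sum_{s=1}^{q} \Gamma_s \bar y_s$,
and

(2) Covariance Matrix $= \sigma^2 I$ ($I$ is the identity matrix),
where

(3) $\bar y_1, \dots, \bar y_q$ form an orthogonal set and each is orthogonal to the space
generated by $\bar x_1, \dots, \bar x_p$.

$\bar U^*$ is normally distributed in $R^n$ with

(1) $E\{\bar U^*\} = \sum_{r=1}^{p} \beta_r^* \bar x_r^* + \sum_{s=1}^{q} \Gamma_s^* \bar y_s^*$,
and

(2) Covariance Matrix $= \sigma^{*2} I$,
where

(3) $\bar y_1^*, \dots, \bar y_q^*$ form an orthogonal set and each is orthogonal to the space
generated by $\bar x_1^*, \dots, \bar x_p^*$.

The hypothesis to be tested is $H_q = \{\Gamma_s = \Gamma_s^*;\ s = 1, \dots, q\}$.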

2.2. Examples. The following examples of the above structure have been
treated in the literature.

EXAMPLE 2.1. The original Behrens-Fisher problem [3]. This case is
characterized by $p = 0$, $q = 1$, $\bar y' = (1, \dots, 1)$ and $\bar y^{*\prime} = (1, \dots, 1)$. Within each
sample the mean is constant and the hypothesis to be tested is that the samples
have the same mean. The solution in this paper reduces to that given by Scheffé
[2].

EXAMPLE 2.2. The problem treated by Barankin [1]. For this problem we
have $p = 1$, $q = 1$, $\bar x' = (1, \dots, 1)$, $\bar y' = (\xi_1 - \sum \xi_i/m, \dots, \xi_m - \sum \xi_i/m)$,
$\bar x^{*\prime} = (1, \dots, 1)$ and $\bar y^{*\prime} = (\eta_1 - \sum \eta_i/n, \dots, \eta_n - \sum \eta_i/n)$. The hypothesis
$H_1$ is that the regression coefficients on $\xi$ and $\eta$ are equal. The test proposed
reduces to that given by Barankin; however the properties proved are different.
In this paper the test is shown to be most powerful or most powerful unbiased
in a general class of similar tests. Barankin showed the test was most powerful
among tests based on certain linear combinations of the variables (see [1]).

EXAMPLE 2.3. Problem discussed by Chand [4]. We have $p = 0$, $q = 2$, and
$\bar y_1, \bar y_2, \bar y_1^*, \bar y_2^*$ are respectively the vectors $\bar x, \bar y, \bar x^*, \bar y^*$ given in the previous
problem 2.2. $\Gamma_1$ and $\Gamma_2$ are $\alpha$ and $\beta$ in the notation used in [4]. The hypothesis to
be tested is $H_2 = \{\alpha = \alpha^*, \beta = \beta^*\}$. It is a different problem to test the
hypothesis $H = \{\alpha' = \alpha'^*, \beta' = \beta'^*\}$, where

$E\{U\} = \alpha' + \beta'\xi, \qquad E\{U^*\} = \alpha'^* + \beta'^*\xi^*.$

This can be treated in the present formulation by finding $a_{11}, a_{12}, a_{21}, a_{22}$
such that $a_{11}(1, \dots, 1)' + a_{12}\bar\xi$, $a_{21}(1, \dots, 1)' + a_{22}\bar\xi$ are orthogonal and
$a_{11}(1, \dots, 1)' + a_{12}\bar\xi^*$, $a_{21}(1, \dots, 1)' + a_{22}\bar\xi^*$ are orthogonal.


3. Structural simplification. 

3.1. Linear transformations. Assuming that we are dealing with samples of 
size m and n and with given observed values of the factors involved, we introduce 
the following transformations in the sample spaces. These transformations con- 
siderably simplify the formulation of the problem. 

In the space $R^m$ of the first sample, we introduce an orthogonal transformation
with matrix $A$. Let the components of a vector in the new $m$ dimensional space
be $L_1, \dots, L_m$; then $\bar L = A\bar U = (L_1, \dots, L_m)'$. We impose the following
conditions on the transformation.

(i) The space generated by $\bar x_1, \dots, \bar x_p$ is mapped into the space generated
by the coordinate vectors corresponding to $L_1, \dots, L_p$.

(ii) The space generated by $\bar y_1, \dots, \bar y_q$ is mapped into the space generated
by the coordinate vectors corresponding to $L_{p+1}, \dots, L_{p+q}$. We require further
that the vector $\bar y_1$ is mapped into $[N(\bar y_1)]^{1/2} = \eta_1$ times the coordinate vector of
$L_{p+1}$, $\bar y_2$ is mapped into $[N(\bar y_2)]^{1/2} = \eta_2$ times the coordinate vector of $L_{p+2}$, and
so on for all the vectors $\bar y_1, \dots, \bar y_q$. This can be done since the vectors are
mutually orthogonal and orthogonal to all the vectors $\bar x$. We use $N(\bar y)$ for the
norm of a vector $\bar y$; that is, $N(\bar y) = \bar y'\bar y$.

These conditions on $A$ are equivalent to the following equation.

$A\,(\bar x_1, \dots, \bar x_p, \bar y_1, \dots, \bar y_q) = \begin{pmatrix} * & 0 \\ 0 & \operatorname{diag}(\eta_1, \dots, \eta_q) \\ 0 & 0 \end{pmatrix}$
(row blocks $1, \dots, p$; $p + 1, \dots, p + q$; $p + q + 1, \dots, m$),

where all elements of the matrix except those in the block marked $*$ and in the
indicated diagonal are zero. This can be satisfied by replacing the vectors $\bar x$ by
an orthogonal set, normalizing all $p + q$ vectors, and using transposes of these
$p + q$ vectors as the first $p + q$ rows in the matrix $A$. Complete $A$ by adding
$m - p - q$ normalized orthogonal row vectors.
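
The recipe for $A$ in the last two sentences can be written out directly. This is a minimal sketch assuming NumPy; `build_A` is a hypothetical name, the input $\bar y$ vectors are assumed already mutually orthogonal and orthogonal to the $\bar x$ vectors, and a QR factorization is used both to orthogonalize the $\bar x$'s and to complete the basis.

```python
import numpy as np


def build_A(X, Y):
    """Orthogonal m x m matrix A satisfying conditions (i) and (ii).

    X: m x p matrix whose columns are the x vectors.
    Y: m x q matrix whose columns are the y vectors, assumed mutually
       orthogonal and orthogonal to the columns of X.
    """
    p, q = X.shape[1], Y.shape[1]
    Qx, _ = np.linalg.qr(X)                    # orthonormal set spanning the x space
    Yn = Y / np.linalg.norm(Y, axis=0)         # normalized y vectors
    first = np.hstack([Qx, Yn])                # m x (p + q), orthonormal columns
    Q_full, _ = np.linalg.qr(first, mode="complete")
    rest = Q_full[:, p + q:]                   # orthonormal basis of the complement
    return np.vstack([first.T, rest.T])


rng = np.random.default_rng(1)
m, p, q = 8, 2, 3
Q, _ = np.linalg.qr(rng.normal(size=(m, p + q)))
X = Q[:, :p] @ rng.normal(size=(p, p))         # x vectors, not orthogonal among themselves
Y = Q[:, p:] * np.array([2.0, 0.5, 3.0])       # orthogonal y vectors with lengths eta_s
A = build_A(X, Y)
print(np.round(A @ Y, 6))                      # eta_s appears in row p + s of column s
```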

The original assumptions and structure of the problem are transformed by 
A to the following. 

II. $\bar L$ is normally distributed in $R^m$ with

(1) $E(L_1) = \gamma_1, \dots, E(L_p) = \gamma_p$;
    $E(L_{p+1}) = \Gamma_1\eta_1, \dots, E(L_{p+q}) = \Gamma_q\eta_q$;
    $E(L_{p+q+i}) = 0$, $i = 1, \dots, m - p - q$,

and
(2) Covariance Matrix $\sigma^2 I$.

The constants $\gamma_1, \dots, \gamma_p$ are independent linear functions of the constants
$\beta_1, \dots, \beta_p$. Since the $\beta$'s are unknown regression coefficients in this problem,
so also are the coefficients $\gamma_1, \dots, \gamma_p$.

Similarly we introduce a transformation $A^*$ in the space $R^n$. The new structure
for $L_1^*, \dots, L_n^*$ is given by the above equations where each $\Gamma$, $\gamma$, $\eta$, $L$ is given
an asterisk.

The hypothesis to be tested is $H_q = \{\Gamma_s = \Gamma_s^*;\ s = 1, \dots, q\}$.

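We further transform the problem by the following transformation of the
variables.

$T_s = \dfrac{L_{p+s}}{\eta_s} - \dfrac{L^*_{p+s}}{\eta^*_s}, \qquad S_s = \dfrac{L_{p+s}}{\eta_s} + \dfrac{L^*_{p+s}}{\eta^*_s}, \qquad s = 1, \dots, q;$

$V_r = L_r, \qquad V_r^* = L_r^*, \qquad r = 1, \dots, p;$

$z_i = L_{p+q+i}, \quad i = 1, \dots, m - p - q;$

$z_j^* = L^*_{p+q+j}, \quad j = 1, \dots, n - p - q.$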

The formulation of the original problem $P_q$ in terms of these transformed
random variables $T_s$, $S_s$, $V_r$, $V_r^*$, $z_i$, $z_j^*$ is:

III. $z_i$; $i = 1, \dots, m'$ are normally and independently distributed with
means 0 and variances $\sigma^2$.
$z_j^*$; $j = 1, \dots, n'$ are normally and independently distributed with
means 0 and variances $\sigma^{*2}$.
$V_r$; $r = 1, \dots, p$ are normally and independently distributed with
means $\gamma_r$ and variances $\sigma^2$.
$V_r^*$; $r = 1, \dots, p$ are normally and independently distributed with
means $\gamma_r^*$ and variances $\sigma^{*2}$.

$(T_1, S_1)$ is normal with mean $(d_1, a_1)$ and covariance matrix

$\begin{pmatrix} c_1\sigma^2 + c_1^*\sigma^{*2} & c_1\sigma^2 - c_1^*\sigma^{*2} \\ c_1\sigma^2 - c_1^*\sigma^{*2} & c_1\sigma^2 + c_1^*\sigma^{*2} \end{pmatrix},$

$\dots$

$(T_q, S_q)$ is normal with mean $(d_q, a_q)$ and covariance matrix

$\begin{pmatrix} c_q\sigma^2 + c_q^*\sigma^{*2} & c_q\sigma^2 - c_q^*\sigma^{*2} \\ c_q\sigma^2 - c_q^*\sigma^{*2} & c_q\sigma^2 + c_q^*\sigma^{*2} \end{pmatrix}.$

A priori Hypothesis: $c_1, c_1^*, \dots, c_q, c_q^*$ are known positive real numbers;
$\sigma^2, \sigma^{*2}, \gamma_1, \dots, \gamma_p, \gamma_1^*, \dots, \gamma_p^*, a_1, \dots, a_q$ are real numbers with values
unknown.
Hypothesis $H_q$ to be tested: $\{d_1 = d_2 = \dots = d_q = 0\}$. Alternative
Hypothesis: $\{(d_1, \dots, d_q) \in D \subset R^q;\ (0, \dots, 0) \notin D\}$.

Note that $m' = m - p - q$, $n' = n - p - q$. Also the $\gamma$'s, $\gamma^*$'s, $a$'s, and $d$'s
are simply related to the original regression coefficients $\beta$'s, $\beta^*$'s, $\Gamma$'s, $\Gamma^*$'s in terms
of constants which depend on the values of the original regression variables.
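
A small numerical sketch of this change of variables may help fix the bookkeeping. It rests on two assumptions that are read off from the covariance structure in III rather than stated explicitly in the text: that $T_s$ and $S_s$ are the difference and the sum of $L_{p+s}/\eta_s$ and $L^*_{p+s}/\eta^*_s$, and that $c_s = 1/\eta_s^2$, $c_s^* = 1/\eta_s^{*2}$. The function names are hypothetical and NumPy is assumed.

```python
import numpy as np


def transformed_variables(L, L_star, eta, eta_star, p):
    """Split the rotated samples L, L* into T, S, V, V*, z, z* (assumed definitions)."""
    L, L_star = np.asarray(L, float), np.asarray(L_star, float)
    eta, eta_star = np.asarray(eta, float), np.asarray(eta_star, float)
    q = eta.size
    u = L[p:p + q] / eta                  # L_{p+s} / eta_s
    u_star = L_star[p:p + q] / eta_star   # L*_{p+s} / eta*_s
    return {
        "T": u - u_star,
        "S": u + u_star,
        "V": L[:p],
        "V_star": L_star[:p],
        "z": L[p + q:],
        "z_star": L_star[p + q:],
    }


def ts_covariance(c, c_star, sigma2, sigma2_star):
    """Covariance matrix of (T_s, S_s) as displayed in III."""
    v, w = c * sigma2, c_star * sigma2_star
    return np.array([[v + w, v - w], [v - w, v + w]])


out = transformed_variables([1, 2, 6, 4, 5], [0, 1, 8, 7], eta=[2], eta_star=[4], p=2)
print(out["T"], out["S"], out["z"], out["z_star"])
print(ts_covariance(1.0, 2.0, 3.0, 4.0))
```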

3.2. Reduction of the class of similar regions. We look for a general class of 
similar regions under the hypothesis H,. The following theorem allows us to 
restrict our choice. 

THEOREM 1. Any region in $R^{m+n}$ similar of size $\alpha$ under variation of $\gamma_1, \dots,
\gamma_p$, $\gamma_1^*, \dots, \gamma_p^*$ has almost everywhere conditional size $\alpha$ given $V_1, \dots, V_p$,
$V_1^*, \dots, V_p^*$.

The proof follows immediately from the completeness of the multiple
Laplace transform.

Under the hypothesis $H_q$ and its alternative the conditional distribution
given the $V$'s and $V^*$'s is independent of the $V$'s, $V^*$'s and $\gamma$'s, $\gamma^*$'s. Thus any
similar test most powerful against a simple alternative can be chosen independent
of the $V$'s and $V^*$'s.

3.3. The principle of invariance. The reduction obtained from Theorem 1 can 
also be obtained on the basis of the principle of invariance, which in fact provides 
an additional reduction. Consider the following linear transformation 


$V_r' = V_r + \lambda_r, \qquad V_r^{*\prime} = V_r^* + \lambda_r^*, \qquad r = 1, \dots, p,$
$S_s' = S_s + \mu_s, \qquad s = 1, \dots, q.$

This induces the following transformation on the parameter space:

$\gamma_r' = \gamma_r + \lambda_r, \qquad \gamma_r^{*\prime} = \gamma_r^* + \lambda_r^*, \qquad r = 1, \dots, p,$
$a_s' = a_s + \mu_s, \qquad s = 1, \dots, q.$

It is easily seen that the problem is invariant under such transformations since
the values of the parameters $\{d_s\}$ are unaffected. Thus on the basis of the
principle of invariance we look for a test which is independent of the values of the
$V$'s, $V^*$'s, and $S$'s.

Our problem has now been considerably simplified; it can be summarized as
follows.

IV. $T_1, \dots, T_q$; $z_1, \dots, z_{m'}$; $z_1^*, \dots, z_{n'}^*$ are normally and independently
distributed with means $d_1, \dots, d_q$; $0, \dots, 0$; $0, \dots, 0$ and variances
$c_1\sigma^2 + c_1^*\sigma^{*2}, \dots, c_q\sigma^2 + c_q^*\sigma^{*2}$; $\sigma^2, \dots, \sigma^2$; $\sigma^{*2}, \dots, \sigma^{*2}$. The values
$c_1, c_1^*, \dots, c_q, c_q^*$ are prescribed, and $\sigma^2$, $\sigma^{*2}$ are unknown. The hypothesis
to be tested is: $H_q = \{d_1 = \dots = d_q = 0\}$.


4. Test for $H_1$. In this section the Scheffé test [2] is shown to be most powerful
or most powerful unbiased in a general class of similar tests.

4.1. The class of similar regions. The problem $P_1$ may be slightly modified
by changing the scale of the $z$'s by a factor $c^{1/2}$ and the $z^*$'s by a factor $c^{*1/2}$. Letting
$c\sigma^2$ and $c^*\sigma^{*2}$ be respectively $\sigma^2$ and $\sigma^{*2}$, we have $P_1$: $T$; $z_i$; $z_j^*$ ($i = 1, \dots, m'$;
$j = 1, \dots, n'$) are normally and independently distributed with means $d$; 0; 0
and variances $\sigma^2 + \sigma^{*2}$; $\sigma^2$; $\sigma^{*2}$. The hypothesis is $H_1 = \{d = 0\}$.

We now define a class of regions similar of size $\alpha$ for the probability
distributions of $P_1$ assuming $H_1$; the distributions have $d = 0$ and no restrictions on
$\sigma^2$ and $\sigma^{*2}$. Assume $m' \le n'$ without loss of generality. Considering $(T; z_i; z_j^*)$
as a point in $(m' + n' + 1)$-dimensional Euclidean space $R^{m'+n'+1}$, we form
hyperspherical cylinders of the following form:

$\mathcal{R}_P(r) = \Big\{ (T; z_i; z_j^*) \ \Big|\ \sum_{i=1}^{m'} \Big( z_i + \sum_{j=1}^{n'} a_{ij}^{(P)} z_j^* \Big)^2 + T^2 = r^2 \Big\}.$

Here $\|a_{ij}^{(P)}\|$ is an $n' \times n'$ orthogonal matrix and $P$ indexes it within the class
$\mathcal{P}$ of all orthogonal $n' \times n'$ matrices.
On the hyperspherical cylinder $\mathcal{R}_P(r)$, consider hyperplanes of the following
form:

$\mathcal{H}_P(c_1, \dots, c_{m'}; r) = \Big\{ (T; z_i; z_j^*) \ \Big|\ z_i + \sum_{j=1}^{n'} a_{ij}^{(P)} z_j^* = c_i \ (i = 1, \dots, m'),\ \sum_{i=1}^{m'} c_i^2 + T^2 = r^2 \Big\}.$
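We construct a class $\mathcal{C}'_\alpha$ of regions in $R^{m'+n'+1}$. $\mathcal{C}'_\alpha$ consists of all sets $S$ having
the following structure: $S = \bigcup_{r \in R'} \bigcup_{P \in \mathcal{P}^*} S_{rP}$, where the sets $S_{rP}$ satisfy

1. $S_{rP} \subset \mathcal{R}_P(r)$ is measurable in the space $\mathcal{R}_P(r)$.

2. $S_{rP} \cap S_{r'P'} = \emptyset$ unless $r = r'$, $P = P'$.

3. $\mathcal{P}^*$ is a countable subset of $\mathcal{P}$.

4. If $\bigcup_{P \in \mathcal{P}^*} [S_{rP} \cap \mathcal{H}_P(c_1, \dots, c_{m'}; r)] \ne \emptyset$, then using Lebesgue measure
$\lambda$ over the Euclidean spaces $\mathcal{H}_P(c_1, \dots, c_{m'}; r)$

$\sum_{P \in \mathcal{P}^*} \int_{\mathcal{H}_P} \varphi_P\, p\, d\lambda = \int_{\mathcal{H}_P} p\, d\lambda,$

where $p$ is the given probability density function, and $\varphi_P$ is the characteristic
function of the set $S_{rP}$. The integrals exist for almost all values of $r$.

5. If $m$ is the uniform normalized measure over the sphere $\mathcal{S}_r =
\{(c_1, \dots, c_{m'}; T) \mid \sum c_i^2 + T^2 = r^2\}$ in $R^{m'+1}$, then $m(S_r) = \alpha$ for almost
all values of $r$ where

$S_r = \Big\{ (c_1, \dots, c_{m'}; T) \ \Big|\ \bigcup_{P \in \mathcal{P}^*} [S_{rP} \cap \mathcal{H}_P(c_1, \dots, c_{m'}; r)] \ne \emptyset \Big\}.$

Let $\mathcal{C}''_\alpha$ be the sets $S \in \mathcal{C}'_\alpha$ for which $S_P = \bigcup_{r \in R'} S_{rP}$ is measurable for $P \in \mathcal{P}^*$,
and let $\mathcal{C}_\alpha$ be all sets which differ from those in $\mathcal{C}''_\alpha$ by sets of measure zero.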

4.2. Proofs concerning similarity. Theorem 2 in this section establishes that
$\mathcal{C}_\alpha$ is a class of similar regions. Theorem 3 shows that any similar region
satisfying Condition 4 above satisfies Condition 5. Also for any measurable region
there is a region differing by a null set for which sets $S_{rP}$ can be defined satisfying
Conditions 1 and 2. Thus, if $\mathcal{C}_\alpha$ is the class of all similar regions of size $\alpha$, a proof
of this would only need to show that for each $S$ the sets $S_{rP}$ can be chosen to
satisfy Conditions 3 and 4.

THEOREM 2. Each region belonging to $\mathcal{C}_\alpha$ is similar of size $\alpha$ with respect to the
measures of $P_1(H_1)$.

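PROOF. Since the measures of $P_1$ are dominated by Lebesgue measure, we
need only establish that $\mathcal{C}''_\alpha$ is a class of similar regions.

Let $S \in \mathcal{C}''_\alpha$ and let $\varphi_S$ be its characteristic function over $R^{m'+n'+1}$. Since $S$
also belongs to $\mathcal{C}'_\alpha$, let $S_{rP}$ and $\mathcal{P}^*$ be as given in Conditions 1-5. Define $S_P$ by
$S_P = \bigcup_{r \in R'} S_{rP}$, and let $\varphi_P$ be the characteristic function of $S_P$. Since
$S = \bigcup_{P \in \mathcal{P}^*} S_P$, we have $\varphi_S = \sum_{P \in \mathcal{P}^*} \varphi_P$. We also define a characteristic function
$\varphi(c_1, \dots, c_{m'}; r)$ as follows:

$\varphi(c_1, \dots, c_{m'}; r) = 1$ if $\bigcup_{P \in \mathcal{P}^*} [S_{rP} \cap \mathcal{H}_P(c_1, \dots, c_{m'}; r)] \ne \emptyset$,
$\varphi(c_1, \dots, c_{m'}; r) = 0$ if $\bigcup_{P \in \mathcal{P}^*} [S_{rP} \cap \mathcal{H}_P(c_1, \dots, c_{m'}; r)] = \emptyset$.

We calculate the probability measure of the set $S$, letting $p$ stand for the p.d.f.
under $P_1(H_1)$.

$\mu(S) = \int_{R^{m'+n'+1}} \varphi_S\, p\, dT \prod_i dz_i \prod_j dz_j^* = \int_{R^{m'+n'+1}} \Big( \sum_P \varphi_P \Big) p\, dT \prod_i dz_i \prod_j dz_j^*$
$\quad = \sum_P \int_{R^{m'+n'+1}} \varphi_P\, p\, dT \prod_i dz_i \prod_j dz_j^*$
$\quad = \sum_P \int_{R'} dr \int_{\mathcal{S}_r} A(r)\, dm \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} \varphi_P\, p\, d\lambda$
$\quad = \int_{R'} dr \int_{\mathcal{S}_r} A(r)\, dm \sum_P \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} \varphi_P\, p\, d\lambda$
$\quad = \int_{R'} dr \int_{\mathcal{S}_r} A(r)\, dm\ \varphi(c_1, \dots, c_{m'}; r) \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} p\, d\lambda$
$\quad = \int_{R'} dr \int_{\mathcal{S}_r} A(r)\, \varphi(c_1, \dots, c_{m'}; r)\, p_{c_1, \dots, c_{m'}; T}\, dm,$

where $A(r)$ is the area of $\mathcal{S}_r$ and where we let

$p_{c_1, \dots, c_{m'}; T} = \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} p\, d\lambda,$

that is, the marginal probability density of $c_1, \dots, c_{m'}, T$. This density is
easily seen to be a spherically symmetric normal density centered on the origin
with variance $\sigma^2 + \sigma^{*2}$. Since it is constant valued over spheres centered at the
origin, let $\Gamma(r) = A(r)\, p_{c_1, \dots, c_{m'}; T}$ and we obtain

$\mu(S) = \int_{R'} \Gamma(r)\, dr \int_{\mathcal{S}_r} \varphi(c_1, \dots, c_{m'}; r)\, dm = \int_{R'} \Gamma(r)\, m(S_r)\, dr = \int_{R'} \Gamma(r)\, \alpha\, dr = \alpha.$

Hence the probability measure of $S$ is $\alpha$, and is independent of the parameters
$\sigma^2$ and $\sigma^{*2}$ of the distribution. This completes the proof.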

THEOREM 3. If the region $S$ is similar of size $\alpha$ and satisfies Condition 4, then
$S$ satisfies Condition 5.

PROOF. Using the notation of Theorem 2, we have

$\alpha = \mu(S) = \int_{R^{m'+n'+1}} \varphi_S\, p\, dT \prod_i dz_i \prod_j dz_j^*.$

All steps in the previous theorem remain valid under the assumptions of this
theorem except the last two. The value of the integral $\int_{\mathcal{S}_r} \varphi(c_1, \dots, c_{m'}; r)\, dm$
may now be a function of $r$, say $\nu(r)$. We have therefore $\alpha = \int_{R'} \Gamma(r)\nu(r)\, dr$.

Since the Gamma densities form a complete system of functions, with parameter
$\sigma^2 + \sigma^{*2}$, $\nu(r)$ must be equal to $\alpha$ almost everywhere. But since this implies
the fulfillment of Condition 5, the theorem is proved.
4.3. Choice of the test. From the class $\mathcal{C}_\alpha$ of similar regions, we wish to choose
an optimum test of the hypothesis $H_1$. For this we require the following theorem.

THEOREM 4. For any region of the class $\mathcal{C}_\alpha$, there corresponds a region in $\mathcal{C}_\alpha$
with the same power function and having the structure

$S = \bigcup_{r \in R'}\ \bigcup_{(c_1, \dots, c_{m'}; T) \in C_r} \mathcal{H}_P(c_1, \dots, c_{m'}; r),$

where $C_r$ is a subset of $\mathcal{S}_r$ for each $r$.

PROOF. Let $S' \in \mathcal{C}_\alpha$. From the definition of $\mathcal{C}_\alpha$, there exists a measurable set
$S'' \in \mathcal{C}''_\alpha$ which is essentially equivalent to $S'$. $S''$ satisfies Condition 4, which is
a relation to be satisfied for each $c_1, \dots, c_{m'}, r$. But $c_1, \dots, c_{m'}, r$ being fixed
implies $T$ is fixed.

Under an alternative hypothesis the p.d.f. “$p$” is changed by a factor depending
only on $\sigma^2 + \sigma^{*2}$, $T$, and $d$. This factor is a constant over all points in the
ranges of integration in Condition 4. It follows then that Condition 4 being
true on the hypothesis $H_1$ implies that it is true under any alternative hypothesis.

To construct $S$ we arbitrarily choose a $P \in \mathcal{P}^*$ and using the structure of $S''$
given by Conditions 1-5, we choose

$C_r = \Big\{ (c_1, \dots, c_{m'}; T) \ \Big|\ \bigcup_{P \in \mathcal{P}^*} [S_{rP} \cap \mathcal{H}_P(c_1, \dots, c_{m'}; r)] \ne \emptyset \Big\},$

$S = \bigcup_{r \in R'}\ \bigcup_{(c_1, \dots, c_{m'}; T) \in C_r} \mathcal{H}_P(c_1, \dots, c_{m'}; r).$

It follows easily that Conditions 1-5 are fulfilled for the set $S$. Therefore $S \in \mathcal{C}'_\alpha$.
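To show that $S \in \mathcal{C}_\alpha$ we must prove that $S$ is a measurable set. $S$ is a cylindrical
set with a cross section or base set $\bar S$ defined in $R^{m'+1}$, the space of $c_1, \dots,
c_{m'}; T$.

$\bar S = \{ (c_1, \dots, c_{m'}; T) \mid S \cap \mathcal{H}_P(c_1, \dots, c_{m'}; r) \ne \emptyset \}.$

It therefore suffices to show that this base set $\bar S$ is measurable. This can be
established by the following simple argument.

For the region $S''$, define as in Theorem 3 regions $S_P$ and characteristic
functions $\varphi''$ and $\varphi_P$. Then

$\mu(S'') = \int_{R^{m'+n'+1}} \varphi''\, p\, dT \prod_i dz_i \prod_j dz_j^*$
$\quad = \sum_{P \in \mathcal{P}^*} \int_{R^{m'+n'+1}} \varphi_P\, p\, dT \prod_i dz_i \prod_j dz_j^*$
$\quad = \int_{r \in R'} dr \int_{\mathcal{S}_r} A(r)\, dm \sum_{P \in \mathcal{P}^*} \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} \varphi_P\, p\, d\lambda.$

Then applying Condition 4 we have

$\mu(S'') = \int_{r \in R'} dr \int_{\mathcal{S}_r} A(r)\, dm\ \varphi(c_1, \dots, c_{m'}; r) \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} p\, d\lambda$
$\quad = \int_{r \in R'} dr \int_{\mathcal{S}_r} A(r)\, \varphi(c_1, \dots, c_{m'}; r)\, p_{c_1, \dots, c_{m'}; T}\, dm$
$\quad = \int_{R^{m'+1}} \varphi(c_1, \dots, c_{m'}; r)\, p_{c_1, \dots, c_{m'}; T}\, dT \prod_{i=1}^{m'} dc_i.$

But since $\mu(S'')$ is equal to this last expression, it follows that $\varphi(c_1, \dots, c_{m'}; r)$ is
measurable. However, $\varphi(c_1, \dots, c_{m'}; r)$ is the characteristic function of the
set $\bar S$ and therefore $\bar S$ is measurable and $S \in \mathcal{C}_\alpha$.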

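We have now only to establish that the power function for the region $S$ is
identical to the power function of $S''$. We shall need to know that Condition 4
is fulfilled under alternative hypotheses; this we have already shown. Letting
$p$ be the p.d.f. of $P_1$ (under $H_1$ or an alternative), we have

$\mu(S'') = \int_{R^{m'+n'+1}} \varphi''\, p\, dT \prod_i dz_i \prod_j dz_j^*$
$\quad = \int_{R'} dr \int_{\mathcal{S}_r} A(r)\, dm \sum_{P \in \mathcal{P}^*} \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} \varphi_P\, p\, d\lambda$
$\quad = \int_{R'} dr \int_{\mathcal{S}_r} A(r)\, dm\ \varphi(c_1, \dots, c_{m'}; r) \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} p\, d\lambda$
$\quad = \int_{R'} dr \int_{\mathcal{S}_r} A(r)\, dm \int_{\mathcal{H}_P(c_1, \dots, c_{m'}; r)} \varphi_S\, p\, d\lambda$
$\quad = \int_{R^{m'+n'+1}} \varphi_S\, p\, dT \prod_i dz_i \prod_j dz_j^* = \mu(S),$

where $\varphi_S$ is the characteristic function of $S$. This completes the proof.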
Thus for tests of size $\alpha$ with critical regions chosen from the class $\mathcal{C}_\alpha$, we can
get as good tests by confining our attention to critical regions of the form

$S = \bigcup_{r \in R'}\ \bigcup_{(c_1, \dots, c_{m'}; T) \in C_r} \mathcal{H}_P(c_1, \dots, c_{m'}; r).$

Regions of this type are cylindrical with base in the space $R^{m'+1}$ of $c_1, \dots, c_{m'}, T$
and axes the coordinate axes of the remaining variables. Whether a sample
point in $R^{m'+n'+1}$ belongs to $S$ can be ascertained by observing whether
$c_1, \dots, c_{m'}, T$ falls in the base set in $R^{m'+1}$. It is worth noticing that the size
and power of regions of this type are independent of the particular $P$ used in the
defining equation above.

Our problem has thus been reduced to the following: $T, c_1, \dots, c_{m'}$ are
normally and independently distributed with means $d, 0, \dots, 0$ and common
variance $\sigma^2 + \sigma^{*2}$. The hypothesis is $H_1 = \{d = 0\}$. This is the simplest
‘Student’ problem.

The choice of a similar test for this reduced problem depends on whether the
alternatives are one- or two-sided and will be the usual $t$-test, one- or two-sided.
Thus among similar tests of $H_1$ with critical region restricted to the class $\mathcal{C}_\alpha$,
we have found a most powerful or most powerful unbiased test. This test is
identical to that proposed by Barankin [1].
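
For concreteness, the reduced ‘Student’ problem leads to the statistic $t = T / (\sum_i c_i^2 / m')^{1/2}$ with $m'$ degrees of freedom, where $c_i = z_i + \sum_j a_{ij}^{(P)} z_j^*$. The sketch below computes it under the assumptions that the variables have already been rescaled as in Section 4.1 and that NumPy is available; the function name and the default choice of $P$ as the identity matrix are illustrative only.

```python
import numpy as np


def student_t_from_reduction(T, z, z_star, P=None):
    """t statistic and degrees of freedom for the reduced problem.

    T:      scalar with variance sigma^2 + sigma*^2 (mean d).
    z:      m' values with variance sigma^2.
    z_star: n' >= m' values with variance sigma*^2.
    P:      n' x n' orthogonal matrix; the identity is used when omitted.
    """
    z, z_star = np.asarray(z, float), np.asarray(z_star, float)
    m1, n1 = z.size, z_star.size
    if m1 > n1:
        raise ValueError("the construction assumes m' <= n'")
    P = np.eye(n1) if P is None else np.asarray(P, float)
    if not np.allclose(P @ P.T, np.eye(n1)):
        raise ValueError("P must be orthogonal")
    c = z + (P @ z_star)[:m1]           # c_i = z_i + sum_j a_ij z*_j, i = 1..m'
    t = T / np.sqrt(np.sum(c ** 2) / m1)
    return t, m1


t, df = student_t_from_reduction(2.0, [1.0, -1.0], [1.0, 1.0, 5.0])
print(t, df)
```

The value of `t` is then referred to Student's distribution with `df` degrees of freedom, one- or two-sided as the alternatives require.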


5. Calculation of the Test Criterion for $P_1$. Rather than give an explicit
expression for the test criterion, we describe in terms usual to the analysis of
variance the procedure for determining it. This will avoid errors in substituting
in an unwieldy expression and the possibility of typographical errors in such
expressions (for example, in formulas (37) and (39) in [1] the sign of the second
term in the denominator should be negative).
We need first the linear function contained in the numerator of the $t$-criterion.
This is to within a constant factor the difference between the regression
coefficient for $\bar y$ in the first sample and the corresponding coefficient in the second
sample. These coefficients are independent of whether the $\bar y$ vectors have been
orthogonalized to the corresponding $\bar x$ vectors: we need only the coefficients in
the joint regression. Calculate the variance of the difference between these two
coefficients: it will be of the form $a\sigma^2 + b\sigma^{*2}$.

For the calculation of the denominator, we distinguish two cases. In each
delete $n - m$ members of the second sample.

I. The vectors $\bar y, \bar x_1, \dots, \bar x_p$ generate the same space as $\bar y^*, \bar x_1^*, \dots, \bar x_p^*$. In
the two samples of size $m$ fit by regression the vectors $\bar y, \bar x_1, \dots, \bar x_p$ (or
equivalently $\bar y^*, \bar x_1^*, \dots, \bar x_p^*$). Proceed as in the analysis of covariance; calculate the
sum of squares (SS) of the $U$'s, the sum of products (SP) of the $U$'s and $V$'s,
and the sum of squares of the $V$'s. Then using the regression coefficients as
obtained from the samples of size $m$, calculate $SS_U$, $SP_{UV}$, and $SS_V$ for residuals.
The denominator of the test criterion will be:

(5.1)  $\sqrt{\dfrac{a\,SS_U + 2(ab)^{1/2}\,SP_{UV} + b\,SS_V}{m - p - 1}}.$

II. The vectors $\bar y, \bar x_1, \dots, \bar x_p$ do not generate the same space as $\bar y^*, \bar x_1^*, \dots, \bar x_p^*$
do. Choose a linearly independent set of vectors $\bar w_1, \dots, \bar w_t$ which generate the
space spanned by the combined set of vectors $\bar y, \bar y^*, \bar x_1, \dots, \bar x_p, \bar x_1^*, \dots, \bar x_p^*$ and which
in addition satisfy the conditions:

$\bar w_1, \dots, \bar w_{p+1}$ generate the space spanned by $\bar y, \bar x_1, \dots, \bar x_p$;

$\bar w_{t-p}, \dots, \bar w_t$ generate the space spanned by $\bar y^*, \bar x_1^*, \dots, \bar x_p^*$.

As a consequence of these conditions $\bar w_{t-p}, \dots, \bar w_{p+1}$ generate the intersection
of the spaces mentioned in the two conditions above. Calculate the regression
coefficients obtained from fitting the vectors $\bar w_1, \dots, \bar w_t$ to the first sample.
In doing so, fit the first $(p + 1)$ $\bar w$'s and then record $l_1^2, \dots, l_{t-p-1}^2$, the
successive amounts by which the sum of squares of residuals is reduced by fitting in
addition $\bar w_{p+2}, \dots, \bar w_t$ one by one. Let $SS_U$ be the final sum of squares of
residuals. Repeating the above procedure for the second sample of $m$, but using
the $\bar w$ vectors in the order $\bar w_t, \bar w_{t-1}, \dots, \bar w_1$, obtain $l_1^{*2}, \dots, l_{t-p-1}^{*2}$; $SS_V$.
Calculate the overall sum of products and using the two sets of regression
coefficients on all the $\bar w$'s, obtain $SP_{UV}$ the sum of products of residuals. The
denominator of the test criterion will be

(5.2)  $\sqrt{\dfrac{a\,SS_U + 2(ab)^{1/2}\,SP_{UV} + b\,SS_V + \sum_{i=1}^{t-p-1} (a^{1/2} l_i + b^{1/2} l_i^*)^2}{m - p - 1}},$

where $l_i$, $l_i^*$ are given the signs of the corresponding regression coefficients.
It is easily seen that formulas (5.1) and (5.2) give the denominators for the
optimum test criterion as described in Section 4.


6. The problem: $P_q$, $q > 1$. No attempt has been made to obtain a general
class of similar regions for $P_q$ $(q > 1)$ except in a very particular case described
below. The asymptotically best test in the sense that the $F$ ratio is best in the
analysis of variance (see Wald [6]) is, however, immediately available. To obtain
this, calculate a $t$-criterion for each of the $q$ regression coefficients, in each
case putting the remaining $q - 1$ vectors with the $\bar x$ vectors and proceeding as
in Section 5. Each $t^2$ has asymptotically the $\chi_1^2$ distribution and is asymptotically
independent of the other $t^2$'s. Combine the $q$ $t^2$'s to obtain a criterion having
asymptotically the $\chi_q^2$ distribution under $H_q$. Use large values of $\chi^2$ to form the
size $\alpha$ critical region.

In small samples a similar procedure is available if each numerator for the t criteria has the same ratio b/a = k where a and b are given in Section 5. Divide each numerator by the corresponding $a^{1/2}$, and square and add to obtain the numerator of an F-ratio. For the denominator use the square of the t-ratio denominator where a and b are given the values 1 and k. This F-ratio will have the $F_{q,\,m-p-q}$ distribution under the null hypothesis and the usual power function under the alternatives. The procedure in Section 4 extends immediately; a general class of
similar regions @, exists and its construction is almost identical to the P; con- 
struction. Among tests having regions belonging to @, , a best test in the Wald 
sense [6] exists and is the one described at the beginning of this paragraph. 

If the ratios b/a are not constant corresponding to the q pairs of regression coefficients, the following procedure gives an exact test for small samples. Calculate the numerators for the t-ratios as described in Section 5. Also calculate the sums of squares and products as used in formulas (5.1) and (5.2). Partition these sums of squares and products into q sets having $n_1, \dots, n_q$ degrees of freedom ($\sum n_i = m - p - q$). Next evaluate q t-ratios using for the ith denominator the expression

$\sqrt{\dfrac{a_i\,SS_U(n_i) + 2(a_i b_i)^{1/2}\,SP_{UV}(n_i) + b_i\,SS_V(n_i)}{n_i}}$.

From each $t_i^2$, using F tables with 1 and $n_i$ degrees of freedom, calculate $p_i$, the probability of a larger value of $t_i^2$ under the null hypothesis. Under the null hypothesis the $p_i$ will be independently and uniformly distributed on [0, 1]. $\chi^2 = \sum_{i=1}^{q} -2 \ln p_i$ can then be used as a test criterion; under the null hypothesis $\chi^2$ has a $\chi^2$ distribution with 2q degrees of freedom and under the alternative hypothesis large values of $\chi^2$ become relatively more likely. See Fisher [5]. For tests of this type the power curve along the axes of $d_1, \dots, d_q$ in the parameter space depends respectively and only on $n_1, \dots, n_q$ (see Barankin [1] p. 435). Using considerations similar to those used to justify the principle of invariance, it seems reasonable to maximize the lowest curve; that is, we choose the smallest value of $n_i$ as large as possible. We therefore choose $n_i = [(m - p - q)/q]$ for as many i as possible subject to the remaining $n_i$ satisfying $n_i = [(m - p - q)/q] + 1$.
Other possible ways of combining individual tests are discussed in [7]. 
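
The combination of the q probabilities described above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: the function names are hypothetical, the individual probabilities $p_i$ are taken as already computed from F tables, and the chi-square tail probability uses the elementary closed form available when the number of degrees of freedom is even (here 2q).

```python
import math

def chi2_sf_even_df(x, q):
    """P(chi-square with 2q degrees of freedom > x), closed form for even df."""
    half = x / 2.0
    term, total = 1.0, 0.0
    for k in range(q):
        total += term            # adds (x/2)^k / k!
        term *= half / (k + 1)
    return math.exp(-half) * total

def fisher_combine(p_values):
    """Return (chi2 statistic, combined tail probability) for independent p-values."""
    stat = sum(-2.0 * math.log(p) for p in p_values)
    return stat, chi2_sf_even_df(stat, len(p_values))

def split_degrees_of_freedom(total, q):
    """Split `total` degrees of freedom into q parts that differ by at most one."""
    base, extra = divmod(total, q)
    return [base + 1] * extra + [base] * (q - extra)
```

With q = 2 the combined probability reduces to $p_1 p_2 (1 - \ln p_1 p_2)$, which gives a quick check of the routine; `split_degrees_of_freedom(m - p - q, q)` returns an allocation in which the smallest $n_i$ is as large as possible.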
Summary. In a recent paper [1] the authors began the study of the theory of
sequential decision functions for stochastic processes with a continuous time 
parameter. This paper treated the standard problem of testing hypotheses, and 
the advantage of being able to stop at an arbitrary time point (not necessarily 
a multiple of some unit given in advance) was demonstrated in several cases, 
notably in that of deciding between two Poisson processes. The optimal tests 
were Wald probability ratio tests and thus truly sequential. In the present paper 
we treat the problem of estimation, and study in detail the Poisson, Gamma, 
Normal and Negative Binomial processes. It turns out for these processes that, 
with a proper weight function, the minimax (sequential) rule reduces to a fixed- 
time rule. Though we confine ourselves to point-estimation it is clear that similar 
methods apply to interval estimation. It may also be remarked that the case 
when the time-parameter is discrete need not be treated separately. For example, 
as described in Section 6.1, the results of Sections 2 and 3 imply analogous results 
in the case of discrete time, which in turn imply certain results proved in [3] 
and (in the nonsequential case) in [2] by other methods. The treatment of some 
other problems in estimation is discussed in Section 6. This paper may be read 
independently of [1]. 


1. Preliminaries. Let $X(t \mid \omega)$, $t \ge 0$, $\omega \in \Omega$, be a family of stochastic processes in time $t$ which depend on a parameter $\omega$. Let $c(t)$, $t \ge 0$, be a given cost function which represents the cost to the statistician of observing the process up to time $t$. For every $\omega$ in $\Omega$ and $\delta$ in the terminal decision space $D$² let $W(\omega, \delta)$ be the weight function, that is, the loss involved in giving the estimate $\delta$ when $\omega$ is the correct value of the parameter. Let $(T, \delta)$ be a pair of functionals of the sample function $x(t)$ into $(0 \le T \le \infty, D)$, where $\delta$ depends on $x(t)$ only through its values for $0 \le t \le T$ if $T < \infty$ (if $T = \infty$, $\delta$ is undefined, but in accordance with our assumptions on $c(t)$ below we define the quantity of (1) to be $\infty$ if this event occurs with positive probability under $\omega$). The decision rule corresponding to these functionals is: observe up to time $T$ and then (in case $T$ is finite) adopt the estimate $\delta$. If $T$ is a constant independent of the sample function $x(t)$ then the procedure is not truly sequential. It is called a fixed-size or fixed-time estimation procedure. Throughout this paper we shall by $x(t)$ mean $x(t+)$; that is, the sample functions are to be considered as continuous from the right.

² In Sections 3 and 5, we take $\Omega = D$. In Section 2, $D = \Omega + \{\lambda = 0\}$. In Section 4, $D = \Omega + \{\omega = 0\} = \Omega + \{p = 1\}$.

For a given $\omega$ the risk associated with such a procedure is defined by

(1) $R_\omega(T, \delta) = E_\omega\{c(T) + W(\omega, \delta)\}$

where $E_\omega$ denotes the expected value under the assumption that $\omega$ is the true value of the parameter, provided the expected value exists. Assuming the expected value to exist for every $\omega \in \Omega$ we define the maximum risk associated with $T$ and $\delta$ by

(2) $R(T, \delta) = \sup_\omega R_\omega(T, \delta)$,

the supremum taken over all $\omega \in \Omega$.

An estimation procedure $(\hat T, \hat\delta)$ is called minimax if

(3) $R(\hat T, \hat\delta) \le R(T, \delta)$

for any functionals $T$ and $\delta$ for which (2) is defined. If no minimax estimation rule exists, it is still possible to define a minimax sequence of decision rules $(T_n, \delta_n)$ (n = 1, 2, ...), that is, a sequence for which

(4) $\lim_{n \to \infty} R(T_n, \delta_n) = \inf_{T, \delta} R(T, \delta)$.

In the cases we treat it will be shown that a minimax rule does exist. However, 
a slight relaxation of the assumptions (e.g., dropping the continuity assumption 
about the cost function c(t)) may affect the existence of minimax rules, and in
such cases one has to modify the argument only slightly in order to find a minimax 
sequence (which, in the cases treated below, may be taken to consist of fixed-time 
rules). 

Let $\mathcal{B}$ be a Borel field of subsets of $\Omega$ and $R_\omega(T, \delta)$ be a measurable function of $\omega$ with respect to $\mathcal{B}$. Let $F(\omega)$ be a probability distribution on $\Omega$. Then, assuming the integral to exist, we define

(5) $R_F(T, \delta) = \int_\Omega R_\omega(T, \delta)\,dF(\omega)$.

The estimation rule $(T_F, \delta_F)$ is called a Bayes rule for $F$ if

(6) $R_F(T_F, \delta_F) = \inf_{T, \delta} R_F(T, \delta)$.

We shall denote by $\delta^T$ fixed time estimation rules with constant observation time $T$, and in this case we shall write $\delta^T$ instead of the pair $(T, \delta)$. We define

(7) $r_\omega(\delta^T) = E_\omega W(\omega, \delta^T)$

and

(8) $r_F(\delta^T) = \int_\Omega r_\omega(\delta^T)\,dF(\omega)$.

$\delta_F^T$ is called a T-Bayes estimation rule for $F$ if (8) assumes its minimum for $\delta^T = \delta_F^T$.
Let $A$ be any set of sample curves $x(t)$ for which the probability $P_\omega\{x(t) \in A\}$ is defined and is a measurable function of $\omega$. Let $F(\omega)$ be any distribution function over $\Omega$. Then for every $A$ for which $P(A) = \int_\Omega P_\omega\{x(t) \in A\}\,dF(\omega) > 0$ we define the a posteriori probability distribution $F(\omega \mid A)$ by assigning to every Borel set $S \in \mathcal{B}$ the probability $\dfrac{1}{P(A)} \int_S P_\omega\{x(t) \in A\}\,dF(\omega)$. The a posteriori T-risk corresponding to $F$ and $\delta^T$ is defined by

(9) $r_F(\delta^T \mid A) = \int_\Omega r_\omega(\delta^T)\,dF(\omega \mid A)$.

If $r_F(\delta^T \mid A)$ is independent of $A$ we say that the a posteriori risk is independent
of the sample x(t). (It is assumed in the sequel that “many” sets A with the above 
property exist. This is of course the case for the processes usually encountered 
in mathematical statistics and in particular with families of processes, like those 
treated in this paper, with which are associated sufficient statistics of a simple 
nature. Since our primary interest is in statistical applications there seems to be 
no point in inserting a lengthy technical discussion of the precise measurability 
properties required in order to insure that the class of sets A will be sufficiently 
rich.) 

We shall make frequent use of the following obvious remark, which is a familiar 
tool in decision theory (see, e.g., [4]). 

Suppose that, for every $T \ge 0$, there exists a sequence of probability distributions $F_n$ (n = 1, 2, ...) for which there are corresponding T-Bayes solutions $\delta^T_{F_n}$ with the property that the a posteriori risk associated with $F_n$ and $\delta^T_{F_n}$ is independent of the sample $x(t)$, and suppose that there exists $\delta^T$ for which

(10) $r(T) = \sup_\omega r_\omega(\delta^T) = \lim_{n \to \infty} r_{F_n}(\delta^T_{F_n})$.

If there exists a $T_0$ ($0 \le T_0 < \infty$) for which

(11) $c(T_0) + r(T_0) = \min_{T \ge 0}\,[c(T) + r(T)]$

holds, then the fixed time rule $\delta^{T_0}$ is a minimax estimation rule.
The proof of this assertion is evident. Indeed, the conclusion remains valid 
under weaker assumptions. As this is not needed for the sequel we just point out that we could have dropped the assumption of risk independent of the sample and replaced (10) by

$\sup_\omega r_\omega(\delta^T) = \lim_{n \to \infty} \inf_A r_{F_n}(\delta^T_{F_n} \mid A)$.

It may also be worth while to remark that if no $T_0$ satisfying (11) exists, we still have a minimax sequence of estimation rules all of which are fixed-time rules. (These results clearly remain valid also if randomized rules are considered.)
In the examples treated in the sequel $r(T)$ is a nonnegative continuous function. We assume that the cost function $c(T)$ is nonnegative, lower semicontinuous, and tends to infinity as $T \to \infty$. These assumptions guarantee the existence of a $T_0$ which satisfies (11).

We remark, finally, that in the examples of Sections 2 and 3, the minimum of (11) will always be achieved for $T > 0$, since $R(0, \delta) = \infty$ for all $\delta$. In the examples of Sections 4 and 5, this need not be the case. Analogous remarks apply to the discussion of Section 6.


2. The Poisson process. This is defined for every $\lambda > 0$ as a process $X_\lambda(t)$ with independent stationary increments which satisfies

(12) $P\{X_\lambda(t) = x\} = \dfrac{(\lambda t)^x}{x!}\,e^{-\lambda t}$ (x = 0, 1, 2, ...)

for all $t \ge 0$.

We let $\Omega$ be the half-line $0 < \lambda < \infty$ and $\mathcal{B}$ consist of the usual Borel sets.² Our problem is to estimate the mean $\lambda$. It is well known that $x(T)$ is a sufficient statistic for $\lambda$ when the sample curve $x(t)$ is observed for $0 \le t \le T$.

As weight function we take, following Hodges and Lehmann [3] and Girshick and Savage [2],

(13) $W(\lambda, \delta) = \dfrac{1}{\lambda}(\delta - \lambda)^2$.

This is the squared error measured in terms of the variance. As these authors point out, the classical squared error $(\delta - \lambda)^2$ gives, for every finite time, infinite minimax risk, and is thus of no interest unless some additional information about $\lambda$ is known.

Let $F_n(\lambda)$, n = 1, 2, ... be the probability distribution on the half-line $\lambda > 0$ with density

(14) $f_n(\lambda) = \dfrac{1}{n}\,e^{-\lambda/n}$ ($0 < \lambda < \infty$).

Let the process be observed during the time $0 \le t \le T$. The a posteriori probability distribution when $x(T) = x$ is well defined and its density is given by

$f_n(\lambda \mid x) = \dfrac{\lambda^x}{x!}\left(T + \dfrac{1}{n}\right)^{x+1} e^{-\lambda(T + 1/n)}$ ($0 < \lambda < \infty$).

The a posteriori T-risk (see (9)) is given by

$r_n(\delta^T \mid x) = \int_0^\infty \dfrac{1}{\lambda}(\delta^T - \lambda)^2 f_n(\lambda \mid x)\,d\lambda$.

It is easily seen that this is minimized by taking

$\delta^T(x(T) = x) = \left[\int_0^\infty \dfrac{1}{\lambda} f_n(\lambda \mid x)\,d\lambda\right]^{-1} = \dfrac{x}{T + 1/n}$.

Therefore the T-Bayes solution corresponding to $F_n(\lambda)$ is given by

(15) $\delta_n^T = \dfrac{x(T)}{T + 1/n}$.

The corresponding a posteriori risk is

(16) $r_n(\delta_n^T \mid x) = \dfrac{1}{T + 1/n}$,

which is independent of $x(T)$ (hence, $x(T)$ being a sufficient statistic, of the sample).

On the other hand, taking

(17) $\delta^T = \dfrac{x(T)}{T}$

we see that

$r_\lambda(\delta^T) = e^{-\lambda T} \sum_{x=0}^{\infty} \dfrac{(\lambda T)^x}{x!}\,\dfrac{1}{\lambda}\left(\dfrac{x}{T} - \lambda\right)^2 = \dfrac{1}{T}$

for all $\lambda > 0$. Thus we have the following.

For the Poisson process (12) with $0 < \lambda < \infty$ and weight function (13) the fixed-time estimate (17) with $T = T_0$ given by

(18) $c(T_0) + \dfrac{1}{T_0} = \min_{T > 0}\left[c(T) + \dfrac{1}{T}\right]$

is minimax.
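
As a numerical illustration of this result, the sketch below checks by direct summation over (12) that the estimate $x(T)/T$ has risk $1/T$ under the weight function (13), and solves (18) for a linear cost. The linear cost $c(T) = cT$ and the function names are assumptions made for the example only; the text requires just that the cost be nonnegative, lower semicontinuous and tend to infinity.

```python
import math

def poisson_risk(lam, T, terms=400):
    """Risk of x(T)/T under W = (delta - lam)^2 / lam, summed over the Poisson law."""
    mean = lam * T
    total = 0.0
    log_p = -mean  # log P{X(T) = 0}
    for x in range(terms):
        total += math.exp(log_p) * (x / T - lam) ** 2 / lam
        log_p += math.log(mean) - math.log(x + 1)  # log P{X(T) = x + 1}
    return total

def optimal_time_linear_cost(c):
    """Minimizer of c*T + 1/T over T > 0 (assumed linear cost c(T) = c*T)."""
    return 1.0 / math.sqrt(c)
```

For the assumed linear cost the minimizing time is $T_0 = c^{-1/2}$ and the minimax risk $c(T_0) + 1/T_0$ equals $2c^{1/2}$.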


3. The Gamma process. This is defined for every pair of positive numbers $r$ and $\theta$ as a process $X_{r\theta}(t)$ with independent stationary increments such that $X_{r\theta}(0) = 0$ and for every $t > 0$ and $x > 0$

(19) $P\{X_{r\theta}(t) < x\} = \int_0^x \dfrac{y^{rt-1}}{\Gamma(rt)\,\theta^{rt}}\,e^{-y/\theta}\,dy$.

The parameter $r$ will be assumed known, and the space $\Omega$ will consist of the half-line $0 < \theta < \infty$, the Borel sets being the ordinary ones.² Here again it is well known that if the sample curve $x(t)$ is given only for $t \le T$ then $x(T)$ is a sufficient statistic for $\theta$.

As weight function we take

(20) $W(\theta, \delta) = \left[\left(\dfrac{\delta}{\theta}\right)^{\gamma} - 1\right]^2$,

$\gamma$ being an arbitrary positive number. For $\gamma = 1$ this weight function, like (13), is proportional to the square error of the mean measured in terms of the variance and occurs in Hodges and Lehmann [3] and Girshick and Savage [2].
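Let $F_n(\theta)$, n = 1, 2, ... be the probability distribution over $\theta > 0$ with density

(21) $f_n(\theta) = \dfrac{\theta^{-1-1/n}\,e^{-1/\theta}}{\Gamma(1/n)}$ ($0 < \theta < \infty$).

The a posteriori probability distribution when $x(T) = x$ has the density

(22) $f_n(\theta \mid x) = \dfrac{(x+1)^{rT+1/n}\,e^{-(x+1)/\theta}}{\Gamma(rT + 1/n)\,\theta^{rT+1+1/n}}$ ($0 < \theta < \infty$).

The a posteriori T-risk is given by

(23) $r_n(\delta^T \mid x) = \int_0^\infty \left[\left(\dfrac{\delta^T}{\theta}\right)^{\gamma} - 1\right]^2 f_n(\theta \mid x)\,d\theta$.

It is minimized by taking

(24) $\delta^T(x(T) = x) = \left[\int_0^\infty \theta^{-\gamma} f_n(\theta \mid x)\,d\theta \Big/ \int_0^\infty \theta^{-2\gamma} f_n(\theta \mid x)\,d\theta\right]^{1/\gamma} = (x+1)\left[\dfrac{\Gamma(rT + \gamma + 1/n)}{\Gamma(rT + 2\gamma + 1/n)}\right]^{1/\gamma}$.

Substituting this value in (23) we obtain

(25) $r_n(\delta_n^T \mid x) = 1 - \dfrac{\Gamma^2(rT + \gamma + 1/n)}{\Gamma(rT + 1/n)\,\Gamma(rT + 2\gamma + 1/n)}$

which is independent of $x(T)$.

On the other hand the estimator

(26) $\delta^T = \left[\dfrac{\Gamma(rT + \gamma)}{\Gamma(rT + 2\gamma)}\right]^{1/\gamma} x(T)$

gives

(27) $r_\theta(\delta^T) = 1 - \dfrac{\Gamma^2(rT + \gamma)}{\Gamma(rT)\,\Gamma(rT + 2\gamma)}$

for all $0 < \theta < \infty$. Since (27) is independent of $\theta$ and is the limit of (25) as $n \to \infty$ we have:

For the Gamma process (19) with fixed $r$, unknown $\theta$ ($0 < \theta < \infty$) and weight function (20), the fixed time estimate (26) with $T = T_0$ given by

(28) $c(T_0) + 1 - \dfrac{\Gamma^2(rT_0 + \gamma)}{\Gamma(rT_0)\,\Gamma(rT_0 + 2\gamma)} = \min_{T > 0}\left[c(T) + 1 - \dfrac{\Gamma^2(rT + \gamma)}{\Gamma(rT)\,\Gamma(rT + 2\gamma)}\right]$

is minimax.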
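
The constant risk (27) and the multiplier in (26) are easy to evaluate with the log-gamma function. The sketch below is illustrative only (the function names are hypothetical); for $\gamma = 1$ the risk reduces to $1/(rT + 1)$ and the estimate to $x(T)/(rT + 1)$, which provides a simple check.

```python
import math

def gamma_minimax_risk(r, T, gamma=1.0):
    """Constant risk 1 - Gamma(a+g)^2 / (Gamma(a) Gamma(a+2g)) with a = r*T."""
    a = r * T
    log_ratio = (2.0 * math.lgamma(a + gamma)
                 - math.lgamma(a) - math.lgamma(a + 2.0 * gamma))
    return 1.0 - math.exp(log_ratio)

def gamma_estimate(x_T, r, T, gamma=1.0):
    """Estimate [Gamma(a+g) / Gamma(a+2g)]^(1/g) * x(T) with a = r*T."""
    a = r * T
    log_k = (math.lgamma(a + gamma) - math.lgamma(a + 2.0 * gamma)) / gamma
    return math.exp(log_k) * x_T
```

Working on the logarithmic scale avoids overflow of the gamma function when $rT$ is large.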


If instead of using the weight function (20) we use, following Girshick and Savage [2],

(29) $W(\theta, \delta) = \log^2 \dfrac{\delta}{\theta}$,

that is, the squared error when $\log\theta$ is considered as the parameter, then we find that for the distributions (21) the a posteriori T-risk is minimized by taking

$\delta^T(x) = \exp\left\{\int_0^\infty \log\theta\, f_n(\theta \mid x)\,d\theta\right\} = (x+1)\exp\left\{-\dfrac{\Gamma'(rT + 1/n)}{\Gamma(rT + 1/n)}\right\}$

and that the corresponding value of the a posteriori risk is

$\dfrac{\Gamma''(rT + 1/n)}{\Gamma(rT + 1/n)} - \left[\dfrac{\Gamma'(rT + 1/n)}{\Gamma(rT + 1/n)}\right]^2$

which again is independent of $x$. Since

(30) $\delta^T = x(T)\,e^{-\Gamma'(rT)/\Gamma(rT)}$

has the constant risk function

$\dfrac{\Gamma''(rT)}{\Gamma(rT)} - \left[\dfrac{\Gamma'(rT)}{\Gamma(rT)}\right]^2$,

we have as before:

If, instead of (20), the weight function (29) is used, then the fixed time estimate (30) with $T = T_0$ given by

(31) $c(T_0) + \dfrac{\Gamma''(rT_0)}{\Gamma(rT_0)} - \left[\dfrac{\Gamma'(rT_0)}{\Gamma(rT_0)}\right]^2 = \min_{T > 0}\left\{c(T) + \dfrac{\Gamma''(rT)}{\Gamma(rT)} - \left[\dfrac{\Gamma'(rT)}{\Gamma(rT)}\right]^2\right\}$

is minimax.


4. The Negative Binomial process. This is defined for every $\omega > 0$ as a process $X_\omega(t)$ with independent stationary increments satisfying $X_\omega(0) = 0$ and

(32) $P\{X_\omega(t) = x\} = \dfrac{\Gamma(t + x)}{\Gamma(t)\,\Gamma(x + 1)}\,\dfrac{\omega^x}{(1 + \omega)^{t+x}}$ (x = 0, 1, 2, ...)

for every $t > 0$. It is customary to put

(33) $p = \dfrac{1}{1 + \omega}$, $q = \dfrac{\omega}{1 + \omega}$;

then (32) becomes

$P\{X_\omega(t) = x\} = \dfrac{\Gamma(t + x)}{\Gamma(t)\,\Gamma(x + 1)}\,p^t q^x$ (x = 0, 1, 2, ...).

As $\Omega$ we take the half line $0 < \omega < \infty$, the Borel sets being the usual ones.² It is easy to see that $x(T)$ is a sufficient statistic for $\omega$ when $x(t)$ is observed for $0 \le t \le T$.

$X_\omega(t)$ has mean $\omega t$ and variance $\omega(1 + \omega)t$. It is easily seen that the square error $(\delta - \omega)^2$ would give an infinite minimax risk. We therefore use as weight function

(34) $W(\omega, \delta) = \dfrac{(\delta - \omega)^2}{\omega(1 + \omega)} = \dfrac{p^2}{q}\left(\delta - \dfrac{q}{p}\right)^2$

which is proportional to the square error measured in units equal to the variance.

Let $F_n(p)$, n = 1, 2, ... be the probability distribution over $0 < p < 1$ with density

(35) $f_n(p) = \dfrac{\Gamma(2 + 1/n)}{\Gamma(2)\,\Gamma(1/n)}\,p^{-1+1/n}\,q$ ($0 < p < 1$).

If the process is observed for the period $0 \le t \le T$ and $x(T) = x$ then the a posteriori probability distribution has density

$f_n(p \mid x) = \dfrac{\Gamma(x + 2 + T + 1/n)}{\Gamma(x + 2)\,\Gamma(T + 1/n)}\,p^{T-1+1/n}\,q^{x+1}$, $0 < p < 1$.

The a posteriori T-risk is

$r_n(\delta^T \mid x) = \int_0^1 \dfrac{p^2}{q}\left(\delta^T - \dfrac{q}{p}\right)^2 f_n(p \mid x)\,dp$,

and is minimized by taking

$\delta^T(x(T) = x) = \dfrac{\int_0^1 p\,f_n(p \mid x)\,dp}{\int_0^1 (p^2/q)\,f_n(p \mid x)\,dp}$.

Therefore the T-Bayes estimate corresponding to the a priori distribution $F_n(p)$ with density (35) is given by $\delta_n = (x + 1)/(T + 1 + 1/n)$. The corresponding a posteriori risk is

(36) $r_n(\delta_n \mid x) = \dfrac{1}{T + 1 + 1/n}$,

which is independent of $x(T)$ and, therefore, of the sample.
For every given $\omega$, $0 < \omega < \infty$, the estimator

(37) $\delta^T = \dfrac{x(T)}{T + 1}$

gives the risk

$r_\omega(\delta^T) = \dfrac{p^2}{q} \sum_{x=0}^{\infty} \left(\dfrac{x}{T + 1} - \dfrac{q}{p}\right)^2 \dfrac{\Gamma(T + x)}{\Gamma(T)\,\Gamma(x + 1)}\,p^T q^x = \dfrac{T + q}{(T + 1)^2}$.

The supremum of this expression for $0 \le \omega < \infty$ is $1/(T + 1)$ by (33); that is, equal to the limit of (36) as $n \to \infty$. Hence (10) holds and the remark of Section 1 may be applied to give the following.

For the Negative Binomial process (32) with $0 < \omega < \infty$ and weight function (34) the fixed time estimate (37) with $T = T_0$ given by

(38) $c(T_0) + \dfrac{1}{T_0 + 1} = \min_{T \ge 0}\left[c(T) + \dfrac{1}{T + 1}\right]$

is a minimax estimate.

It may be worth while to remark that (37) is a biased estimate. Indeed the expected value of it for given $\omega$ is $T\omega/(T + 1)$. The unbiased estimate $x/T$ gives a constant risk $1/T$.
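
The risk of the estimate (37) can be checked by summing over the distribution (32). The following sketch is only an illustration (the function name and the truncation point of the series are assumptions); it evaluates the weighted squared error (34) and can be compared with the closed form $(T + q)/(T + 1)^2$, which stays below its supremum $1/(T + 1)$.

```python
import math

def negbin_risk(omega, T, terms=2000):
    """Risk of x(T)/(T+1) under W = (delta - omega)^2 / (omega (1 + omega))."""
    p = 1.0 / (1.0 + omega)
    q = omega / (1.0 + omega)
    total = 0.0
    for x in range(terms):
        # log of Gamma(T+x) / (Gamma(T) Gamma(x+1)) * p^T * q^x
        log_pmf = (math.lgamma(T + x) - math.lgamma(T) - math.lgamma(x + 1)
                   + T * math.log(p) + x * math.log(q))
        total += math.exp(log_pmf) * (x / (T + 1.0) - omega) ** 2
    return total / (omega * (1.0 + omega))
```

The truncation at 2000 terms is adequate for moderate $\omega$ and $T$; for large $\omega$ the geometric factor $q^x$ decays slowly and more terms would be needed.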


5. The Normal (Wiener) process. This is defined for every real $\mu$ and positive $\sigma$ as a process with stationary independent increments such that $X_{\mu,\sigma}(0) = 0$ and

$P\{X_{\mu,\sigma}(t) < x\} = \dfrac{1}{\sqrt{2\pi t}\,\sigma} \int_{-\infty}^{x} e^{-(y - \mu t)^2/(2\sigma^2 t)}\,dy$

for every real $x$ and $t > 0$.

The parameter $\sigma$ will be assumed known and the space $\Omega$ will consist of the real line $-\infty < \mu < \infty$, the Borel sets being the ordinary ones.²

As weight function we take any function of the form

(39) $W(\mu, \delta) = w(|\mu - \delta|)$

with $w(x)$ nonnegative and nondecreasing for $x \ge 0$.

In the present case it is not necessary to perform computations similar to 
those of the preceding sections, since it is easily seen that the arguments of 
Wolfowitz [4], where a discrete time parameter was considered, carry over to 
the present case. 

The fixed time estimator

(40) $\delta^T = \dfrac{x(T)}{T}$

gives for $T > 0$ and any real $\mu$ the T-risk

(41) $r_\mu(\delta^T) = \dfrac{\sqrt{2T}}{\sqrt{\pi}\,\sigma} \int_0^\infty w(x)\,e^{-x^2 T/(2\sigma^2)}\,dx$.

Moreover it is easily seen that $r_\mu(\delta^T)$ is, as a function of $T$, for $T > 0$, nonincreasing, continuous from the right and that

$\lim_{T \downarrow 0} r_\mu(\delta^T) = \lim_{x \to \infty} w(x)$.

Disregarding the trivial cases 1) when

(42) $\int_0^\infty w(x)\,e^{-hx^2}\,dx$

is divergent for all $h$, that is, when the risk is always infinite, and 2) $w(x) \equiv 0$, when the value of the estimator is of no consequence, we have:

For the Normal process with fixed variance $\sigma^2$, unknown mean $\mu$ and weight function (39) with $w(x)$, $x \ge 0$, not identically zero, nondecreasing, and such that (42) converges for at least one value of $h$, the fixed time estimate (40) with $T = T_0$ given by

(43) $c(T_0) + \dfrac{\sqrt{2T_0}}{\sqrt{\pi}\,\sigma} \int_0^\infty w(x)\,e^{-x^2 T_0/(2\sigma^2)}\,dx = \min_{T \ge 0}\left[c(T) + \dfrac{\sqrt{2T}}{\sqrt{\pi}\,\sigma} \int_0^\infty w(x)\,e^{-x^2 T/(2\sigma^2)}\,dx\right]$

is minimax, where the term following $c(T)$ and $c(T_0)$ in (43) is replaced by $\sup_x w(x)$ for $T = 0$ or $T_0 = 0$, and where (40) may be replaced by any estimator if the minimum of (43) is at $T = 0$ (which can only occur if $w$ is bounded).
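
The T-risk (41) can be evaluated numerically for a given $w$. The sketch below uses Simpson's rule on a truncated range; the function name, the truncation at 12 standard deviations of $x(T)/T$ and the number of subintervals are assumptions of the example, and the truncation is only reasonable when $w$ grows slowly compared with the Gaussian factor (for instance polynomially).

```python
import math

def normal_risk(w, T, sigma, n=4000, sd_cutoff=12.0):
    """Approximate sqrt(2T)/(sqrt(pi) sigma) * int_0^inf w(x) exp(-x^2 T/(2 sigma^2)) dx."""
    upper = sd_cutoff * sigma / math.sqrt(T)  # truncation point of the integral
    h = upper / n                             # n must be even for Simpson's rule

    def integrand(x):
        return w(x) * math.exp(-x * x * T / (2.0 * sigma * sigma))

    total = integrand(0.0) + integrand(upper)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    integral = total * h / 3.0
    return math.sqrt(2.0 * T) / (math.sqrt(math.pi) * sigma) * integral
```

For $w(x) = x^2$ the result should agree with $\sigma^2/T$, the variance of $x(T)/T$, and for $w(x) = x$ with $\sigma\sqrt{2/(\pi T)}$.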

For further remarks on the normal process see the next section, especially 

6.4 and 6.5. 


6. Generalizations and other remarks. 

6.1. It is by no means impossible to have practically continuous observation of a stochastic process. However, our results apply without any modification to stochastic processes with a discrete time parameter or, more generally, to the case when the observations can be made only at times belonging to some set given in advance (there is, of course, no loss of generality in assuming the time-parameter continuous). Indeed, if $J$ is any closed subset of the reals and an observation at time $T$ may be made only if $T \in J$, all our results remain valid provided $T_0$ and $T$ in (11) [respectively (18), (28), (31), (38) and (43)] are restricted by the condition that they belong to $J$. (The same end could be achieved by having $c(t) = \infty$ for $t \notin J$ and dropping or suitably modifying the assumption that $c(t)$ is continuous.)

For the special case $J = \{0, 1, 2, \dots\}$ we have the usual discrete time case; if, furthermore, $c(t)$ is a linear function of $t$ we have the classical sequential case. In this classical case, Hodges and Lehmann [3] obtained by a different method the fact that the fixed-sample estimator of Section 2 on the Poisson process as well as the first estimator of Section 3 on the Gamma process (for the weight function (20) with $\gamma = 1$), both with $T = n$, minimize $\sup_\omega E_\omega W(\omega, \delta)$ subject to $\sup_\omega E_\omega N \le n$, where $N$ is the number of observations required (a chance variable) and $n$ is a given positive integer. These results are implied by, but do not imply, ours (see [5], Lemma 5); as remarked in [3], their method does not seem applicable to our problem. The Gamma process with the weight function (29) was considered nonsequentially by Girshick and Savage [2] who, using a different method, established that (30) is a minimax estimator for the fixed sample size problem. As far as we know the Negative Binomial process has never been treated before even nonsequentially.

6.2. The impossibility, in practice, of observing a process for a continuous range of time may be taken care of in the following manner. We replace $c(t)$ by $c(n; t_1, t_2, \dots, t_n)$ which represents the cost of taking $n$ observations at times $t_1 < t_2 < \dots < t_n$. The function $c(n; t_1, \dots, t_n)$ is assumed to satisfy appropriate conditions such as being nonnegative and satisfying $c(n + 1; t_1, \dots, t_{n+1}) \ge c(n; t_1, \dots, t_{i-1}, t_{i+1}, \dots, t_{n+1})$ for n = 0, 1, ... and i = 1, 2, ..., n + 1. Our results easily carry over to this case. Thus, for example, for the Poisson process with the weight function considered in Section 2 a minimax estimation procedure is to take a single observation at time $T = T_0$ for which $c(1; T) + 1/T$ becomes a minimum, and to estimate $\lambda$ by $x(T_0)/T_0$.

This modification of the problem may be combined with that of Section 6.1 by considering only times belonging to a given set $J$.

6.3. Another modification of the sequential estimation problem is the following: The statistician is required to estimate $\omega$ continuously by a function $\delta(t)$ which is a functional of the observed process up to time $t$, and the loss function is

$\int_0^\infty W(\omega, \delta(t))\,dG(t)$

where $G(t)$ is a monotone nondecreasing function. Our methods apply also to this modified problem. For example in the case of the Poisson process with the weight function used in Section 2 a minimax procedure is obtained by taking $\delta(t) = x(t)/t$.

This formulation may be combined with that of Section 6.2 by having a cost function $c(n; t_1, \dots, t_n; \nu; \tau_1, \dots, \tau_\nu)$ which expresses the cost of observing the process at times $t_1 < \dots < t_n$ and changing the estimator $\delta(t)$ at times $\tau_1 < \dots < \tau_\nu$. Here again if for every $T$ we can find a sequence of probability distributions with T-Bayes solutions for which the a posteriori risk is independent of the sample and an estimator $\delta^T$ satisfying (10) we deduce that a minimax procedure is obtained as follows: choose $n, t_1, \dots, t_n, \nu, \tau_1, \dots, \tau_\nu$ so as to minimize

$c(n; t_1, \dots, t_n; \nu; \tau_1, \dots, \tau_\nu) + \int_0^\infty r(\bar t)\,dG(t)$

where $\bar t = \max_{t_i \le \tau_j} t_i$ for $\tau_j \le t < \tau_{j+1}$ (with $\tau_{\nu+1} = \infty$), and estimate by $\delta(t) = \delta^{\bar t}$. It is easily seen that if $c$ reduces to a function of $n$ and $\nu$ only which is monotone in both arguments, then one can choose the $\tau_j$ from among the $t_i$.

This modification may also be considered together with that of Section 6.1. We may further combine it with a weight function which is dependent on the time, etc.

6.4. Throughout the paper we dealt with the problem of point estimation, 
but it is possible to treat similarly the problem of sequential interval estimation 
(including that of one-sided estimation). In particular, for the case of the Normal 
process the results of Wolfowitz [4] carry over to the case of a continuous time 
parameter. 

6.5. We would like now to make some remarks about a class of estimation problems best exemplified by the problem of estimating the variance of a Normal process.

Let $T_1$ and $T_2 > T_1$ be any two nonnegative numbers and put $t_{m,n} = T_1 + (n/2^m)(T_2 - T_1)$ for m = 1, 2, ...; n = 0, 1, ..., $2^m$. It is well known (see [1]) that if $x(t)$ is a sample function of the Normal process then

$\lim_{m \to \infty} \sum_{n=1}^{2^m} [x(t_{m,n}) - x(t_{m,n-1})]^2 = \sigma^2 (T_2 - T_1)$

with probability 1. Therefore if one could observe a Normal process without error for an arbitrarily short period of time one would know the correct value of $\sigma$ with probability 1. To make the problem practical it is necessary to modify the problem somewhat, for example, in the manner suggested in Section 6.2. We observe that if $X(t)$ is the Normal process with mean $\mu$ and variance $\sigma^2$ then, for every positive $T$ and $t$, the random variable $(1/2t)[X(T + t) - X(T) - \mu t]^2$ has the Gamma distribution given by (19) with $r = \frac{1}{2}$, $\theta = \sigma^2$ and $t = 1$. Therefore, if the mean is known and the problem is to estimate the variance we could apply the results of Section 3 (with the modifications suggested in Section 6.2). If both the mean and the variance are unknown again only a slight change, corresponding to the loss of one degree of freedom in the chi square distribution, is necessary.
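
The limit relation for the sums of squared increments can be illustrated by simulation. The sketch below is a hypothetical example, not part of the argument: it draws the $2^m$ increments of a Normal process over $[T_1, T_2]$ directly from their Gaussian law and sums their squares; the function name and the use of a seeded generator are choices made for the illustration.

```python
import math
import random

def realized_variance(mu, sigma, T1, T2, m, rng):
    """Sum of squared increments over the dyadic partition of [T1, T2] with 2^m cells."""
    n = 2 ** m
    dt = (T2 - T1) / n
    total = 0.0
    for _ in range(n):
        # Increment of the process over an interval of length dt.
        increment = rng.gauss(mu * dt, sigma * math.sqrt(dt))
        total += increment * increment
    return total
```

For moderate $m$ the sum is already close to $\sigma^2 (T_2 - T_1)$ whatever the value of $\mu$, since the contribution of the drift is of order $2^{-m}$.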

The situation encountered in this last subsection does not occur if the process is observed continuously but not exactly. This may be done in various ways, a suggestive one for estimating the variance being the following. The process is observed continuously but only deviations exceeding a prescribed size $\Delta$ are recorded; that is, we are given a sequence of real numbers $0 = t_0 < t_1 < t_2 < \cdots$ having the property that $|x(t_{i+1}) - x(t_i)| \ge \Delta$ while $|x(t) - x(t_i)| < \Delta$ for $t_i < t < t_{i+1}$ (i = 0, 1, 2, ...).

6.6. One may also consider for continuous (in time) processes such problems as those of sequential unbiased estimation (see [6]) and of unbiased estimation at the conclusion of sequential hypothesis testing (see [7], [8]). For example, for the first of these, one can prove an analogue of the extension of the Cramér-Rao inequality proved in [6], where for our setup $En$ is replaced by $ET$ and $f(x, \theta)$ is replaced by the probability function or density function of $x(1)$ in equation (4.5) of [6]. Under regularity conditions analogous to those of [6], and which are satisfied for the four processes considered herein, the proof (also valid for biased estimators) may be carried out by dividing the time axis into intervals of equal length and allowing the length to approach zero. In particular, the fixed duration estimator of duration $T$ and with estimator $x(T)/T$ is an unbiased estimator of $\lambda$, $\theta$, $(1/p) - 1$, and $\mu$ (for the cases considered in Sections 2, 3, 4, and 5, respectively) for which equality holds in this extended Cramér-Rao inequality. This inequality could also be used to apply the technique of [3] to our problems. The two problems described above will both be considered in detail in a future paper.

Finally, we remark that many of Wald’s general results on decision functions 
(complete class theorems, etc.) carry over to the present case of continuous time 
processes under suitable assumptions. As in [1], the main difficulties in the general 
theory are ones of measurability, and we shall not bother with them here. We 
shall return to these problems in a future publication. 
1. Summary. Let the random variable $X = (X_1, X_2, \dots, X_n)$ have the probability density

$p(x) = \dfrac{(\det \Omega)^{1/2}}{(2\pi)^{n/2}}\,e^{-\frac{1}{2} x \Omega x'}$

where $x \Omega x'$ is positive definite. The present article solves, by means of Laguerrian expansions, the problem of finding the distribution of any nonnegative
quadratic form XPX’. If the semimoments (defined below) are known, it also 
solves, by means of Laguerrian expansions the problems of finding the distribu- 
tion of any indefinite quadratic form, and the distribution of the ratio of any 
indefinite quadratic form to any nonnegative quadratic form. For an outline 
of the procedure, see Section 2. If the distribution of the indefinite form is sym- 
metric, the semimoments are easily found, but often, especially for the tech- 
nique described below for ratios, the semimoments are difficult to obtain. In 
view of this, a new system of orthogonal polynomials is proposed, which is 
analogous to the Laguerre system, but which obviates the need of semimoments. 


2. Introduction. There are various distribution problems associated with a quadratic form or a ratio of quadratic forms. Suppose $P$ and $Q$ are arbitrary symmetric $n \times n$ matrices of known constants. The notation $(DQ)_i$, $(DR)_i$, $(DQ)_i'$, $(DR)_i'$ (i = 1, 2) refers to the following distribution problems:

$(DQ)_1$: Find the distribution of $XQX'$.

$(DQ)_2$: Same as $(DQ)_1$, but with $Q$ nonnegative.

$(DR)_1$: Find the distribution of $XQX'/XPX'$, but with $P$ nonnegative.

$(DR)_2$: Same as $(DR)_1$ but also with $Q$ nonnegative.

$(DQ)_1'$, $(DQ)_2'$: Same as $(DQ)_1$, $(DQ)_2$ respectively, with the further restriction that $\Omega = I$ and $Q$ is diagonal. ($I$ denotes the unit $n \times n$ matrix.)

$(DR)_1'$, $(DR)_2'$: Same as $(DR)_1$, $(DR)_2$ respectively, with the further restriction that $\Omega = I$ and both $P$ and $Q$ are diagonal.

It should be noted that although it is always possible by a linear transformation to reduce $(DQ)_i$ to $(DQ)_i'$ (i = 1, 2), the same is not true for reducing $(DR)_i$ to $(DR)_i'$. In fact, a necessary and sufficient condition for the latter reduction is (cf. Weyl [11]) $PGQ = QGP$ where $G = TT'$, and $T$ is a matrix such that $T'\Omega T = I$.


It is possible, however, to reduce $(DR)_1$ to $(DQ)_1$ by means of a linear transformation. This will be shown in the following section. It is relevant to add that Robbins [7], and Pitman and Robbins [6] have solved, by means of certain convergent series, $(DQ)_2$ and a certain subclass of $(DR)_2$. The present article differs mainly in two respects. First, the more general problems $(DQ)_1$ and $(DR)_1$ are considered here. Second, only Laguerrian expansions are used. It may be added here that the distribution of quadratic forms or ratios of quadratic forms for certain cases has been investigated by McCarthy [13], von Neumann [14], and Bhattacharyya [15]. It may be noted further that Bhattacharyya [16] and Hotelling [17] have employed Laguerrian expansions for the cases they consider.

Received 9/16/52.

For convenience in applying the results obtained in this paper, the procedure is outlined as follows.

I. $(DQ)_2$. Reduce to $(DQ)'_2$, so that $XQX' = \sum_{i=1}^{n} \gamma_i Y_i^2$ where $Y$ has density $f(y)$ given by (2). By a change of scale arrange that $0 < \gamma_i < 1$. Let $F(x) = P\{XQX' \le x\}$ and $K(x) = \int_0^x g(t)\,dt$ with $g(t)$ assigned by (14). Compute the moments $\mu_\nu = E(XQX')^\nu$. The Laguerrian expansion (convergent, here) of $F(x) - K(x)$ is given by (10), where

(i) $L_n^{(\alpha)}(x)$ is defined by (4) or (5),

(ii) $A_n^{(\alpha)} = a_{n+1}^{(\alpha-1)} = \sum_{\nu=0}^{n+1} (-1)^\nu \binom{n+\alpha}{n+1-\nu} \frac{\mu_\nu}{\nu!}$.

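II. $(DQ)_1$. Reduce to $(DQ)'_1$ so that $XQX' = \sum_{i=1}^{n_1} \gamma_i Y_i^2 - \sum_{i=n_1+1}^{n_1+n_2} \gamma_i Y_i^2$ where $0 < \gamma_i < 1$ (arranged by change of scale) and $Y$ has density $f(y)$ given by (2). $K(x)$ and $F(x)$ have the same meaning as in I. Compute the semimoments

$$\delta_\nu = \int \cdots \int \Bigl[\sum_{i=1}^{n_1} \gamma_i y_i^2 - \sum_{i=n_1+1}^{n_1+n_2} \gamma_i y_i^2\Bigr]^{\nu} f(y)\,dy$$

where the region of integration is given by $\sum_{i=1}^{n_1} \gamma_i y_i^2 - \sum_{i=n_1+1}^{n_1+n_2} \gamma_i y_i^2 \ge 0$. If the density of $XQX'$ is symmetric, its characteristic function may be used to find the $\delta_\nu$'s, without requiring explicit knowledge of the $\gamma_i$'s. The Laguerrian expansion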


(convergent here) of F(x) — K(x) for x > 0 is given by (10), with 


A a gt — E : *) i Of 


For x = 0, and x < 0, see Section 7. 
III. $(DR)_1$. Reduce to $(DQ)'_1$ by the method of Section 3, so that

$$G(z) = P\Bigl\{\frac{XQX'}{XPX'} \le z\Bigr\} = P\Bigl\{\sum_{i=1}^{n} \lambda_i(z) Y_i^2 \le 0\Bigr\}$$

where $Y$ has density $f(y)$ given by (2). By a change of scale, it is possible to write $|\lambda_i(z)| < 1$. The method then follows that of II, replacing $\gamma_i$ there by $\lambda_i(z)$.


3. Reduction of the $(DR)_1$ problem. For any real $z$, let $R_z = Q - zP$ and $G(z) = P\{(XQX')/(XPX') \le z\}$. Setting $H_z(\xi) = P\{XR_zX' \le \xi\}$ it follows (recalling that $XPX'$ is nonnegative) that $H_z(0) = G(z)$. Let $\lambda_i(z)$ $(i = 1, 2, \cdots, n)$ be the roots of $\det(R_z - \lambda\Omega) = 0$ and $\Lambda_z$ be a diagonal matrix with these roots as the diagonal elements. The roots $\lambda_i(z)$ will all be real, and there exists a matrix $T$ such that (cf. Bôcher [1]) $T'\Omega T = I$ and $T'R_zT = \Lambda_z$.

Hence

(1) $H_z(\xi) = P\{X\Lambda_zX' \le \xi\}$

where $X$ has the probability density

(2) $f(x) = (2\pi)^{-n/2}\, e^{-xx'/2}$.

Now $X\Lambda_zX'$ is a linear combination of independent random variables each of which is distributed as $\chi^2$ with one degree of freedom. The $(DR)_1$ problem of determining $G(z)$ thus reduces to the special $(DQ)_1$ problem of finding $H_z(0)$, that is, the distribution of this linear combination at the single point $\xi = 0$. Geometrically, this may be interpreted as the problem of finding the probability measure of the interior of the cone $x\Lambda_zx' = 0$ when the probability density of $X$ is given by (2).
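The reduction can be illustrated numerically. The following is a minimal sketch, assuming the simplest setting in which $\Omega = I$ and both $P$ and $Q$ are diagonal, so that the roots are simply $q_i - z p_i$; the function names and the Monte Carlo check are illustrative assumptions and are not part of the paper's method.

```python
import math
import random

def lambdas(q_diag, p_diag, z):
    """Diagonal of R_z = Q - zP when Q and P are diagonal and Omega = I (assumed)."""
    return [q - z * p for q, p in zip(q_diag, p_diag)]

def ratio_cdf_mc(q_diag, p_diag, z, draws=100_000, seed=0):
    """Monte Carlo estimate of G(z) = H_z(0) = P{sum_i lambda_i(z) X_i^2 <= 0}."""
    rng = random.Random(seed)
    lam = lambdas(q_diag, p_diag, z)
    hits = 0
    for _ in range(draws):
        s = sum(l * rng.gauss(0.0, 1.0) ** 2 for l in lam)
        if s <= 0.0:
            hits += 1
    return hits / draws

# Example: the ratio X_1^2 / (X_1^2 + X_2^2), whose distribution function
# is known in closed form as (2/pi) * asin(sqrt(z)) for 0 <= z <= 1.
estimate = ratio_cdf_mc([1.0, 0.0], [1.0, 1.0], 0.25)
exact = 2.0 / math.pi * math.asin(math.sqrt(0.25))
```

Here the event that the ratio does not exceed $z$ is evaluated only through the sign of the indefinite form, which is the point of the reduction.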

In anticipation of Section 7 it should be remarked here that in determining the semimoments of $X\Lambda_zX'$ it is sometimes not necessary to find the roots $\lambda_i(z)$. For, by the uniqueness of the characteristic function


’ ~} 
det (Q — 2itR.) = it (1 —- 2in(0 | 
j=1 


the moments are found upon evaluating the derivatives at $t = 0$, and dividing by the appropriate power of the imaginary number $i$. If, for instance, the density of $X\Lambda_zX'$ is symmetric, the semimoments are just half the corresponding moments.

It is also of some interest to see how the reduction of the (DR), problem is 
accomplished to yield directly the probability density of the ratio. Of course, the 
probability density could be obtained by differentiation of the distribution 
function, but this might not be advisable or feasible, depending on the con- 
vergence properties of the approximation used in determining the distribution 
function. 

The following theorem will now be proved. This theorem applies generally, 
irrespective of quadratic forms, to any ratio (absolutely continuous random 
variable, positive denominator) and any probability density p(.). 

THEOREM 1. Let $X$ have probability density $p(x)$, and define $K$ and $q(y)$ by

$$K = \int xPx'\, p(x)\,dx, \qquad q(y) = \frac{yPy'\, p(y)}{K}.$$

Then the probability density $G'(z)$ is given by $K r_z(0)$ where $r_z(\xi)$ is the probability density of the random variable $YR_zY'$ when $q(y)$ is the density of $Y$.
PROOF. From the theory of inversion formulae (cf. Gurland [4])

$$G'(z) = \frac{1}{2\pi i} \fint \left[\frac{\partial \phi(t_1, t_2)}{\partial t_2}\right]_{t_2 = -t_1 z} dt_1$$

where

$$\phi(t_1, t_2) = E\bigl(e^{i(t_1 XQX' + t_2 XPX')}\bigr)$$

and the notation $\fint$ signifies $\lim_{\epsilon \to 0,\, T \to \infty} \bigl(\int_{-T}^{-\epsilon} + \int_{\epsilon}^{T}\bigr)$. Now

$$\left[\frac{\partial \phi}{\partial t_2}\right]_{t_2 = -t_1 z} = i \int xPx'\, e^{i t_1 xR_zx'}\, p(x)\,dx = K i\, \theta_z(t_1)$$

where $\theta_z(t)$ is the characteristic function of $XR_zX'$ when $X$ has the probability density $q(x)$. Hence

$$G'(z) = \frac{K}{2\pi} \fint \theta_z(t)\,dt = K r_z(0)$$

by inversion of Fourier transforms. This completes the proof.
Before applying some results of this section, we shall, at this point, state some 
theorems relating to Laguerrian series. 


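4. Laguerrian series. By a Laguerrian series is meant an expansion of the form

(3) $f(x) \sim \sum_{n=0}^{\infty} c_n^{(\alpha)} L_n^{(\alpha)}(x)$

where

(4) $L_n^{(\alpha)}(x) = \frac{1}{n!}\, e^{x} x^{-\alpha} \Bigl(\frac{d}{dx}\Bigr)^{n} \bigl(x^{n+\alpha} e^{-x}\bigr), \qquad \alpha > -1.$

The sign of equivalence in (3) indicates the coefficients $c_n^{(\alpha)}$ are determined by

$$c_n^{(\alpha)} = \frac{\Gamma(n+1)}{\Gamma(n+\alpha+1)} \int_0^\infty e^{-t} t^{\alpha} L_n^{(\alpha)}(t) f(t)\,dt$$

in view of the orthogonality relations

$$\int_0^\infty e^{-t} t^{\alpha} L_m^{(\alpha)}(t) L_n^{(\alpha)}(t)\,dt = \begin{cases} 0, & m \ne n \\ \dfrac{\Gamma(n+\alpha+1)}{\Gamma(n+1)}, & m = n. \end{cases}$$

From (4) it follows that

(5) $L_n^{(\alpha)}(x) = \sum_{\nu=0}^{n} \binom{n+\alpha}{n-\nu} \frac{(-x)^\nu}{\nu!}.$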
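The explicit sum for the generalized Laguerre polynomial lends itself to direct evaluation. The sketch below assumes the standard form $L_n^{(\alpha)}(x) = \sum_{\nu=0}^{n} \binom{n+\alpha}{n-\nu} (-x)^\nu/\nu!$ with a generalized binomial coefficient; the helper names are hypothetical and chosen only for illustration.

```python
import math

def gen_binom(a, m):
    """Generalized binomial coefficient C(a, m) for real a and integer m >= 0."""
    out = 1.0
    for j in range(m):
        out *= (a - j) / (j + 1)
    return out

def laguerre(n, alpha, x):
    """Evaluate L_n^(alpha)(x) from the explicit finite sum."""
    return sum(
        gen_binom(n + alpha, n - v) * (-x) ** v / math.factorial(v)
        for v in range(n + 1)
    )
```

For instance, `laguerre(1, alpha, x)` reduces to `1 + alpha - x`, and the value at the origin is the binomial coefficient $\binom{n+\alpha}{n}$. The same sum also exhibits the differentiation rule $\frac{d}{dx} L_{n+1}^{(\alpha-1)}(x) = -L_n^{(\alpha)}(x)$, which can be checked numerically with a difference quotient.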


Before stating the following theorem, the notion of equiconvergence of series will be recalled. If the series $\sum_{n=0}^{\infty} (u_n - A v_n)$ is convergent, where $A$ is a nonzero constant, then the series $\sum_{n=0}^{\infty} u_n$, $\sum_{n=0}^{\infty} v_n$ are said to be equiconvergent.
It is also convenient to recall the definitions of O(g(x)) and o(g(x)) 


$f(x) = O(g(x))$, $x \to x_0$, means $f(x)/g(x)$ remains bounded as $x \to x_0$;
$f(x) = o(g(x))$, $x \to x_0$, means $f(x)/g(x) \to 0$ as $x \to x_0$.
THEOREM 2. (Szegő [9]). Let $f(x)$ be Lebesgue measurable, $0 \le x < \infty$, and
let the integrals 
pl 


1 
" x“ | f(x) | dx, [ pt l2ala | f(a) | dx 
? ~ 


d 


exist. If the condition 
(7) [et ct" | $02) | de = ofr) 


is satisfied, and if s,(x) denotes the nth partial sum of the Laguerre series (3), we 
have, forx > 0 


2h+5 


; —— ) 
(8) im Je4(0) — 2 ff ste) PE = ah = 0 


nao i-5 xai-—r 


where $\delta$ is a fixed positive number, $\delta < x^{1/2}$. This holds uniformly for every fixed positive interval $\epsilon \le x \le \omega$, $\delta < \epsilon^{1/2}$.

The same equiconvergence theorem (8) is valid if the integrals (6) exist and (7) is replaced by the following:


°o 
| g wit,et2—3'4 | f(x) | dx 
1 


is convergent, and 


[ 4°" F(x)” dx = o(n™*”*), 


The integral occurring in (8) is essentially the partial sum of order [n’] of a 
Fourier series, where as usual [n*] denotes the largest integer <n’. A sufficient 
condition for the validity of (8) is 


f(z) = O°"? 2-0"), 6>0, r> &. 


Before quoting a second theorem of Szegő, which ensures summability of (3) at $x = 0$, we recall the definition of Cesàro summability. Let $s_n$ denote the $n$th partial sum $\sum_{\nu=0}^{n} u_\nu$. The series $\sum_{\nu=0}^{\infty} u_\nu$ is said to be $(C, k)$ summable (cf. Zygmund [12]), $k > -1$, to the sum $s$ if $\lim_{n \to \infty} s_n^{(k)}/C_n^{(k)} = s$ where

$$C_n^{(k)} = \frac{(n+k)(n+k-1) \cdots (k+1)}{n!},$$

$$s_n^{(k)} = \sum_{\nu=0}^{n} C_{n-\nu}^{(k-1)} s_\nu = \sum_{\nu=0}^{n} C_{n-\nu}^{(k)} u_\nu.$$
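The definition of $(C, k)$ summability can be made concrete with a short computation. This is a sketch restricted, as an assumption for simplicity, to nonnegative integer $k$ and exact rational arithmetic; the function names are illustrative only.

```python
from fractions import Fraction

def cesaro_number(n, k):
    """C_n^(k) = (n+k)(n+k-1)...(k+1) / n! for integer k >= 0."""
    out = Fraction(1)
    for j in range(1, n + 1):
        out *= Fraction(k + j, j)
    return out

def cesaro_mean(terms, k):
    """(C, k) mean s_n^(k) / C_n^(k) formed from the terms u_0, ..., u_n."""
    n = len(terms) - 1
    s = sum(cesaro_number(n - v, k) * Fraction(terms[v]) for v in range(n + 1))
    return s / cesaro_number(n, k)

# The divergent series 1 - 1 + 1 - 1 + ... is (C, 1) summable to 1/2.
grandi = [(-1) ** v for v in range(200)]
mean_c1 = cesaro_mean(grandi, 1)
```

With $k = 0$ the mean is the ordinary partial sum, and with $k = 1$ it is the arithmetic mean of the partial sums $s_0, \ldots, s_n$.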
THEOREM 3. (Szegő [9]). Let $f(x)$ be measurable, $0 \le x < \infty$, and continuous at $x = 0$. If we assume the existence of the integral


(9) | ett "8 #2) | de 
1 


the Laguerrian series (3) is summable $(C, k)$ at $x = 0$ to the sum $f(0)$, provided $k > \alpha + \tfrac12$. This statement is not true for $k \le \alpha + \tfrac12$.
The condition regarding (9) is satisfied if


f(z) = oO" 2° **), §>0, tr». 


It should be remarked that for the case x = 0, the kth Cesaro mean has the 
simplified form 


2 


(CP Ta + 1} I etf(Q Lt (t) dt. 
0 


5. Laguerrian expansions for distribution functions. Let a random variable have the distribution function $F(x) = \int_{-\infty}^{x} p(t)\,dt$. By analogy with Gram-Charlier series, we may consider


p(x) ~ &*x* ¢ an Ly (zx) 
(10) ey 
F(z) — K(x) ~ e&*z* DS AML (2) 


n=O 


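where

(11) $a_n^{(\alpha)} = \int_0^\infty p(t) L_n^{(\alpha)}(t)\,dt, \qquad A_n^{(\alpha)} = \int_0^\infty [F(t) - K(t)]\, L_n^{(\alpha)}(t)\,dt$

and $K(x)$ is a conveniently chosen distribution function

(12) $K(x) = \int_0^x g(t)\,dt.$

Note that $a_n^{(\alpha)}$, $A_n^{(\alpha)}$ are linear functions of the "moments" taken over the interval $(0, \infty)$ and not $(-\infty, \infty)$. We shall call such moments "semimoments."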

It is in order at this point to remark why Laguerrian rather than Gram-Charlier series are being considered here for the aforementioned distribution problems. The main reason is that Cramér's condition ([3], p. 233), $\int_{-\infty}^{\infty} e^{x^2/4}\,dF(x) < \infty$, sufficient for convergence, is not satisfied for these problems; and the theorems which guarantee Cesàro or Abel summability (cf. Szegő [9], Hille [5]) do not relax Cramér's condition very much if at all.

In this paper, expansions of the type (10) will be considered. In order to simplify the formula (11) for $A_n^{(\alpha)}$, it is necessary to refer to the following lemma.


Lemma 1. If all the absolute moments corresponding to F(x) and A(z) are 
finite, then 


F(z) — K(x) = o(x™’), z—+0 
for allr > 0. 


Proor’. Let M, = | |¢\|" p(t) dt. Then 


M, = | t"p(t) dt = x | p(t) dt. 
z z 


| p(t) dt = O(z~"), 


Similarly for / g(t) dt. By similar reasoning it can also be shown that 


[ p(t) dt = O(x"), [ g(t) dt = O(z") 


Since we can write 


F(z) — K(z)| =|1—K(z) — (1 — F@)) | g(t) dt + / p(t) dt 


| F(z) — K(z) | [ r(t) at | OE 


the required result follows. 
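To apply this lemma, let $M_n^{(\alpha)}(x)$ be such a polynomial that

$$\frac{d}{dx} M_n^{(\alpha)}(x) = L_n^{(\alpha)}(x)$$

and, for convenience, let $M_n^{(\alpha)}(0) = 0$. Since

$$\frac{d}{dx} L_{n+1}^{(\alpha-1)}(x) = -L_n^{(\alpha)}(x),$$

as can be seen from (5), it follows that

$$M_n^{(\alpha)}(x) = -L_{n+1}^{(\alpha-1)}(x) + \binom{n+\alpha}{n+1}.$$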


1 The author is grateful to Morton Slater for his assistance in greatly simplifying the 
above proof. 
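Integrating by parts the expression for $A_n^{(\alpha)}$ and applying Lemma 1, $A_n^{(\alpha)}$ becomes

(13) $A_n^{(\alpha)} = -\int_0^\infty M_n^{(\alpha)}(t)\,[p(t) - g(t)]\,dt = a_{n+1}^{(\alpha-1)} - b_{n+1}^{(\alpha-1)} - \binom{n+\alpha}{n+1} \int_0^\infty [p(t) - g(t)]\,dt$

where

$$a_{n+1}^{(\alpha-1)} = \int_0^\infty L_{n+1}^{(\alpha-1)}(t)\, p(t)\,dt, \qquad b_{n+1}^{(\alpha-1)} = \int_0^\infty L_{n+1}^{(\alpha-1)}(t)\, g(t)\,dt.$$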
By choosing 

(14) l(a + 1)" 
= 0, 


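the expression (13) simplifies to

(15) $A_n^{(\alpha)} = a_{n+1}^{(\alpha-1)} - \binom{n+\alpha}{n+1} \int_0^\infty [p(t) - g(t)]\,dt.$

If, further, $p(t) = 0$ for $t < 0$ (as, for instance, in the $(DQ)_2$ problem), the formula becomes

(16) $A_n^{(\alpha)} = a_{n+1}^{(\alpha-1)}.$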

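6. Solution to the $(DQ)_2$ problem. Before proceeding to the solution, it is necessary to establish the following lemma.

LEMMA 2.² Let $X$ have the probability density $f(x)$ given by (2). Define $U = \sum_{i=1}^{n} \gamma_i X_i^2$ where the constants $\gamma_i$ satisfy $0 < \gamma_i < 1$. Denote the probability density of $U$ by $p_U(u)$. Then

$$\frac{u^{(n/2)-1}\, e^{-(u/2)(1+\epsilon')}}{2^{n/2}\, \Gamma(n/2) \prod_{i=1}^{n} \gamma_i^{1/2}} \le p_U(u) \le \frac{u^{(n/2)-1}\, e^{-(u/2)(1+\epsilon)}}{2^{n/2}\, \Gamma(n/2) \prod_{i=1}^{n} \gamma_i^{1/2}}$$

where

$$1 + \epsilon' = \max_i \gamma_i^{-1}, \qquad 1 + \epsilon = \min_i \gamma_i^{-1}.$$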

2? The author is grateful to Ray Mickey for his assistance in greatly simplifying the 
formulation and proof of this lemma 
PROOF. Apply to (2) the transformation

$$z_i = \sqrt{\gamma_i}\, x_i.$$

Then


1 
pz(z) = =r exp | - 
[Ix] en 


and 


he 


ucEaicutau | I 1 


exp |- 5 (1 + e’) . “| 
“a de 
vs] (V3R)” 


[J pz(z) dz 


n 
u<zer< u+Au 
1 


exp | - ste de ‘| 
ef-{ 4 — 
u<Zei< utAs | | (\/2n) 


By applying the mean value theorem of integral calculus, and letting $\Delta u \to 0$, the required result follows.

To apply this lemma, suppose $(DQ)_2$ has been reduced to $(DQ)'_2$. Let $XQX' = \sum_{i=1}^{n} \gamma_i X_i^2$, where $0 < \gamma_i < 1$ and $X$ has the probability density (2). Hence, by Lemma 2,


dz. 


I Pu(u) du ss errs, 


Now, with A(x) defined by (12) and (14), it follows that 
1 — K(x) = O(e7x*"), 
Thus, 
(F(x) — K(zx)] &a~* = O (max {e779 x"? 7}), rw, 


Theorem 2 is now applicable to establish the convergence of the expansion (10), with $A_n^{(\alpha)}$ given by (16). By Theorem 2, the series for $F(x) - K(x)$ will converge at each point $x$ if the Fourier series converges there. Since $F(x) - K(x)$ is of bounded variation, convergence is assured by Jordan's test (cf. Titchmarsh [10]).


7. Solutions to $(DQ)_1$ and $(DR)_1$ if the semimoments are known. As remarked in Section 1, there may be instances where the semimoments are easily found, as in the case of an indefinite quadratic form with a symmetric probability density. Before applying the convergence theorems of Section 4 it is necessary to establish the following lemma.

LEMMA 3. Let $X$ have the probability density (2) and define

$$U_1 = \sum_{i=1}^{n_1} \gamma_i X_i^2, \qquad U_2 = \sum_{i=n_1+1}^{n_1+n_2} \gamma_i X_i^2$$

where $0 < \gamma_i < 1$, $n_1 + n_2 = n$ $(i = 1, 2, \cdots, n)$. Denote by $p_V(v)$ the probability density of $V = U_1 - U_2$. Then


(17) [ priv) dv = O(e™ FS! FO 21/2) | 


(18) [ pr(v) dv = O(g tact |x — 


where 1 + ¢€ = min; y;°. 
Proor. By Lemma 2 


1+ (n,/2)—1 /2)—1 
Poy.va (ts Ua) S Co Maroy inriD—tyjsnal® 


where 


Hence 


] 
y =} )(1+ ( 2)—1 /2)—1 
[ pvt) dos ff cerbortenatoyionia—tyses!—4 dy dg 
z Ul— ugeez 


= ( f [ ¢ TA+d Busted (,, + us)” ?ys?!? dus dv. 
v=z 4 ug=0 


/2 /2 Ue eal n,/2 n1/2 
(v + uw)" = v" (1+ ) <v"*(1 + u,)"!* 


v 


since v > 1 (because x — ~ in (17)). The validity of (17) follows, since 
[ g tenn, + us)” Pus?!? duz < o, 
0 

Also, since 


[ pr(v) dv = I P Puy.u3( , U2) du; duz, 
oo uj—ugs—z 


rr I Pv;,0 (th » U2) du; due , > @ 
ug- uj 22 


the result (18) is established by an argument similar to that for (17). 
In applying the result of Lemma 3 to $(DQ)_1$, it may be assumed $XQX'$ is in the form $U_1 - U_2$ (as can be effected by a linear transformation). If, also, $K(x)$ is defined by means of (14), then $\{F(x) - K(x)\}\, e^{x} x^{-\alpha}$ will satisfy the conditions of Theorems 2 and 3. Consequently, the expansion (10), with $A_n^{(\alpha)}$ given by (15), will converge (see the last remark of Section 6) for $x > 0$, while for $x = 0$ it will be $(C, 1)$ summable if $\alpha$ is chosen to be zero in Theorem 3.

For $x < 0$, the result of Theorem 2 applies by considering the expansion of $F(-x) - K(-x)$.

Lemma 3 may also be applied in solving the $(DR)_1$ problem by using the reduction of Section 3, and employing the same type of argument as for the $(DQ)_1$ problem above, to show that Theorem 3 ensures the $(C, 1)$ summability of the Laguerrian expansion at $x = 0$.


8. Proposed system of polynomials for the general solution of $(DQ)_1$, $(DR)_1$. As mentioned above, the semimoments are often difficult to obtain. The convergence properties of Laguerrian expansions are most convenient, but the main shortcoming is that the weight function is zero over the range $(-\infty, 0)$. What is required is a nonzero weight function over $(-\infty, \infty)$ which would generate a system of orthogonal polynomials behaving asymptotically in a manner similar to the Laguerrian system. In such a case, ordinary moments rather than semimoments would be used in the determination of the coefficients of the expansion, and these ordinary moments can be found without difficulty. An orthogonal polynomial system which seems to suggest itself naturally is that generated, according to the Gram-Schmidt process (cf. Courant-Hilbert [2]), by means of the weight function

$$w(x) = e^{-|x|}\, |x|^{\alpha}, \qquad -\infty < x < \infty.$$

Shohat [8] has shown that for weight functions similar to this, the resulting system of polynomials is complete, but there appears to be no treatment of the convergence properties of such a system in the literature on orthogonal polynomials. If, as conjectured, this system behaves similarly to the Laguerrian system, then a much larger class of distribution functions will be expansible in convergent (or summable) series than the class to which Gram-Charlier series apply.
SCALING AND ERROR ANALYSIS FOR MATRIX INVERSION BY PARTITIONING

1. Summary and introduction. There is presently available a large number of
techniques purporting to accomplish the inversion of matrices. While the purely 
mathematical aspects of this problem, on one hand, are thus well recognized, 
the computational ones, on the other hand, are not. The growth of the rounding 
error, in particular, may be so rapid as to make some inversion procedures 
altogether unstable. 

It is from this point of view that the partitioning method seems to be capable 
of yielding more accurate results than do other methods. By stopping, at any 
desired step, to improve the intermediate inverses until satisfactory accuracy 
is attained, the growth of the rounding error may be kept in check. 

The following sections, then, give a brief description of the partitioning method 
and treat in some detail an effective scaling scheme permitting the inversion 
routine to be carried out by high speed computing machinery. 

Next a careful examination is carried out of the accuracy attainable by the 
proposed scheme; together with an error squaring iteration procedure it is 
found capable of yielding accuracies sufficient for most practical purposes. 


Received 3/15/52, revised 12/24/52.

2. Method of partitioning. The method of submatrices has been described and discussed in great detail in a number of places, as, for example, Frazer, Duncan, and Collar [1]. It is shown there that the inversion of a nonsingular square matrix $A$ of $n$ dimensions may be accomplished as follows. Let

$$A_k = (\alpha_{ij}), \qquad i, j = 1, 2, \cdots, k; \quad k = 1, 2, \cdots, n$$

denote the sequence of successive principal submatrices of $A$, and let $A_{k+1}$ be partitioned in the form:

$$A_{k+1} = \begin{pmatrix} A_k & a_k \\ \bar a_k & \alpha_{k+1,k+1} \end{pmatrix}, \qquad a_k = \begin{pmatrix} \alpha_{1,k+1} \\ \vdots \\ \alpha_{k,k+1} \end{pmatrix}, \qquad \bar a_k = (\alpha_{k+1,1} \cdots \alpha_{k+1,k}).$$

Then the inverse $A_{k+1}^{-1}$ may be partitioned similarly,

$$A_{k+1}^{-1} = \begin{pmatrix} C_k & c_k \\ \bar c_k & \gamma_k \end{pmatrix},$$

and the components of $A_{k+1}^{-1}$ computed by the algorithm:

(2.1)
$$x_k = -A_k^{-1} a_k, \qquad y_k = -\bar a_k A_k^{-1},$$
$$\delta_k = \alpha_{k+1,k+1} + \bar a_k x_k, \qquad \gamma_k = \delta_k^{-1},$$
$$c_k = x_k \gamma_k, \qquad \bar c_k = \gamma_k y_k,$$
$$C_k = A_k^{-1} + x_k \bar c_k.$$

As a byproduct one also obtains

$$\det A_{k+1} = \delta_k \det A_k.$$

While the nonsingularity of $A$ guarantees the existence of $A^{-1}$ it is possible that some principal minors of $A$ vanish. In such a case rearrangement of rows of $A$ will remedy the situation.
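The recursion over successive principal submatrices can be sketched in a few lines. The following is an illustrative, unscaled version in exact rational arithmetic, assuming every leading principal minor is nonzero (as holds for positive definite matrices); the function name and the use of `Fraction` are assumptions made for the example and say nothing about how the method was run on the machines of the period.

```python
from fractions import Fraction

def invert_by_partitioning(A):
    """Invert a square matrix by bordering its leading principal submatrices.

    Assumes every leading principal minor is nonzero.
    Returns (inverse, determinant) with exact Fraction entries.
    """
    n = len(A)
    inv = [[1 / Fraction(A[0][0])]]            # inverse of the 1 x 1 block
    det = Fraction(A[0][0])
    for k in range(1, n):
        a = [Fraction(A[i][k]) for i in range(k)]      # bordering column
        abar = [Fraction(A[k][j]) for j in range(k)]   # bordering row
        x = [-sum(inv[i][j] * a[j] for j in range(k)) for i in range(k)]
        y = [-sum(abar[i] * inv[i][j] for i in range(k)) for j in range(k)]
        delta = Fraction(A[k][k]) + sum(abar[i] * x[i] for i in range(k))
        gamma = 1 / delta
        c = [xi * gamma for xi in x]           # last column of the new inverse
        cbar = [gamma * yj for yj in y]        # last row of the new inverse
        C = [[inv[i][j] + x[i] * cbar[j] for j in range(k)] for i in range(k)]
        inv = [C[i] + [c[i]] for i in range(k)] + [cbar + [gamma]]
        det *= delta                           # determinant byproduct
    return inv, det
```

Each pass enlarges the known inverse by one row and one column, and the determinant is accumulated as the product of the successive pivots $\delta_k$.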

Certain simplifications result if $A$ is symmetric. Then $\bar a_k = a_k^*$ where the asterisk denotes transposition, $y_k = x_k^*$ and $\bar c_k = c_k^*$. While the partitioning method is applicable to (nonsingular) matrices in general, we shall, in the following, restrict ourselves to positive definite ones, that is, matrices $A$ that are symmetric and for which the quadratic form $x^*Ax$ is positive for any vector $x \ne 0$. This does not constitute too serious a restriction since for any nonsingular matrix $A$ the matrix $A^*A$ is positive definite, and $A^{-1} = (A^*A)^{-1}A^*$.

Well known properties of nonsingular positive definite matrices A that will 
be utilized are: 

i) all diagonal elements are positive;

ii) $\max |\alpha_{ij}|$ is assumed by an element on the main diagonal;

iii) all principal submatrices $A_k$ of $A$ are positive definite;

iv) the inverse of a positive definite matrix is also positive definite.

For positive definite matrices, then, $\gamma_k > 0$, so that also $\delta_k = (1/\gamma_k) > 0$. Now $\delta_k = \alpha_{k+1,k+1} + a_k^* x_k = \alpha_{k+1,k+1} - a_k^* A_k^{-1} a_k$. Since $A_k^{-1}$ is also positive definite, $a_k^* A_k^{-1} a_k > 0$. It follows that

(2.2) $a_k^* A_k^{-1} a_k < \alpha_{k+1,k+1}, \qquad 0 < \delta_k < \alpha_{k+1,k+1}.$


In studying the efficiency of a method, especially if treatment on high speed 
computing machinery is anticipated, it is of importance to know the number 
of arithmetical operations involved in the method proposed. 

For symmetric matrices of order $n$ a count of the operations reveals that the method of partitioning, as described above, requires $\tfrac12(n-1)n(n+2)$ multiplications, $\tfrac12(n-1)n(n+1)$ additions or subtractions, and $n$ divisions. Similar counts carried out for the Gauss Elimination Method, as outlined by von Neumann and Goldstine [2], reveal that in the symmetric case the totals for multiplications and additions are identical with the above. Since other variations of the elimination method take substantially the same number of operations, or more, it is clear that the basic partitioning method is neither favored nor disfavored by virtue of operations alone.


3. Modulus of a Matrix. The measure of the magnitudes of the quantities in the inversion process will ordinarily be the modulus, denoted by $\|\ \|$ and defined as the greatest absolute value of any of the entries, that is,

$$\|(\alpha_{ij})\| = \max_{i,j = 1, \cdots, n} |\alpha_{ij}|.$$

In some cases improvement would result if the norm or bound (see [2]) were to be used, but the modulus is the simplest and most easily used in the situations discussed here. The following well known relationships will be used in the discussion of the rounding error:

$$\|A + B\| \le \|A\| + \|B\|,$$
$$\|\alpha A\| = |\alpha| \cdot \|A\|,$$
$$\|AB\| \le n\, \|A\| \cdot \|B\|,$$
$$\|A^p\| \le n^{p-1} \|A\|^p, \qquad p \ge 1.$$
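The behaviour of the modulus under sums and products is easy to check on a small case. The sketch below assumes the standard bounds for the greatest-absolute-entry measure, namely $\|A + B\| \le \|A\| + \|B\|$ and $\|AB\| \le n\,\|A\|\,\|B\|$; the helper names are illustrative.

```python
def modulus(A):
    """Greatest absolute value among the entries of A."""
    return max(abs(v) for row in A for v in row)

def mat_add(A, B):
    """Entrywise sum of two matrices of equal shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    """Ordinary matrix product of two square matrices."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, -2], [3, 4]]
B = [[0, 5], [-1, 2]]
n = len(A)
sum_ok = modulus(mat_add(A, B)) <= modulus(A) + modulus(B)
prod_ok = modulus(mat_mul(A, B)) <= n * modulus(A) * modulus(B)
cube_ok = modulus(mat_mul(mat_mul(A, A), A)) <= n ** 2 * modulus(A) ** 3
```

The factor $n$ in the product bound is what distinguishes the modulus from a true matrix norm, and it is the source of the powers of $n$ that appear when products are iterated.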


4. The scaling problem. In putting the inversion problem on a machine 
that is capable only of operating on numbers restricted to a finite interval, care 
must be taken to insure that all quantities occurring in the course of the in- 
version procedure actually lie in the prescribed interval. This can be achieved 
by means of appropriate scaling. 

It seems advantageous to carry out the necessary scaling operations by “‘iter- 
ated halving,” that is, successive divisions by 2; further, it will be assumed that 
the scaling produces numbers restricted to the interval (—1, +1). However, for 
a scaling scheme to be efficient it is not sufficient that it merely produce numbers 
of absolute value not exceeding unity; clearly excessive reduction in magnitude 
will adversely affect the accuracy of the numbers. 

It is claimed that the scaling scheme to be described in the following sections is 
very efficient. 

To be introduced into the machine, a preliminary scaling of the given matrix $A'$ is called for; if $t(\iota - 1) \le \|A'\| < t(\iota)$, where we write $t(\iota) = 2^{\iota}$, then the matrix $A = (\alpha_{ij}) = t(-\iota) A'$ satisfies $2^{-1} \le \|A\| < 1$.

In those cases where $\iota < 0$ this leads to upscaling, that is, enlargement of absolute values; this is quite permissible.
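In floating-point terms the preliminary scaling amounts to extracting a binary exponent. A minimal sketch, assuming the modulus is the greatest absolute entry and using the standard-library `math.frexp` to obtain the exponent; the function name `prescale` is hypothetical.

```python
import math

def prescale(A_prime):
    """Return (A, e) with A = 2**(-e) * A' and 1/2 <= modulus(A) < 1."""
    m = max(abs(v) for row in A_prime for v in row)
    if m == 0:
        raise ValueError("a zero matrix cannot be scaled")
    _, e = math.frexp(m)          # m = f * 2**e with 1/2 <= f < 1
    A = [[math.ldexp(v, -e) for v in row] for row in A_prime]
    return A, e

scaled_down, e_down = prescale([[3.0, -10.0], [0.5, 2.0]])
scaled_up, e_up = prescale([[0.1, 0.05]])
```

A negative exponent, as in the second call, corresponds to the upscaling case in which all entries are enlarged.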

Starting with A; = (ay) we first “standardize” ay : 

ay = apt(1 — ry), 2* Saw <1. 
Then the choice of the scale factor {(—X,) will insure that B, = 
Ay t(—d,) satisfies 2’ < | B,| < 1. 

Let us suppose now that at the start of the Ath stage, 1 Sk S 

know £B, and X,. so that 


(4.1) B, = (89) = 4; U(-m), 2 S €i. 
The $k$th stage of the inversion process then consists in the calculation of the properly scaled quantities defined in (2.1). Deleting the subscripts $k$, and putting $\alpha_{k+1,k+1} = \beta$, these equations become:

(4.2)
$$x = -A^{-1}a,$$
$$\zeta = a^*x, \qquad c = x\gamma,$$
$$\delta = \beta + \zeta, \qquad R = xc^*,$$
$$\gamma = \delta^{-1}, \qquad C = A^{-1} + R.$$

Further, it will also be convenient to use the abbreviations a;.4; = aj, Ary = 


M, so that 
: = , c <« 
M + i M = ‘ 
er vi 


ay 


The process to be discussed in the following then consists in the determination 
of a matrix D and an exponent w such that, corresponding to (4.1) 


D=M"t(-w), 2's ||DI| <1. 


The first step of this process calls for the formation of a suitably scaled z. 
An examination of the relationships (4.2) reveals that the accurate determination 
of x is of paramount importance, so that it should be done as carefully as possible. 

In forming the sums $\sum_j \beta_{ij}\alpha_j$ it is observed that partial sums may exceed unity. This can be remedied by computing instead

$$\eta_i = -\sum_{j=1}^{k} \beta_{ij}\, \alpha_j\, t(-\rho)$$

where

(4.3) $t(\rho - 1) < k \le t(\rho).$

Since $|\beta_{ij}| < 1$, $|\alpha_j| < 1$, clearly $0 \le \|\eta\| < 1$. (From the positive definite character of $A$ and the fact that the quantities $\eta_i$, $\beta_{ij}$ are digital numbers, it follows that also $0 \le \|\bar\eta\| \le 1$, where as in Section 5, $\bar\eta$ is the computed approximation to $\eta$.)
However, if this is done, double precision accumulation, as described in Section 5, is imperative if sufficiently accurate values are to be obtained.
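The choice of the exponent in (4.3) and its effect on the partial sums can be sketched as follows. This assumes the exponent is the smallest integer $\rho$ with $2^{\rho-1} < k \le 2^{\rho}$ and uses exact rationals in place of machine numbers; both function names are illustrative.

```python
from fractions import Fraction

def scale_exponent(k):
    """Smallest integer rho with 2**(rho - 1) < k <= 2**rho, for k >= 1."""
    return (k - 1).bit_length()

def scaled_row_sum(beta_row, alpha, rho):
    """Accumulate -sum_j beta_j * alpha_j * 2**(-rho), recording partial sums."""
    total = Fraction(0)
    partials = []
    for b, a in zip(beta_row, alpha):
        total -= Fraction(b) * Fraction(a) / 2 ** rho
        partials.append(total)
    return total, partials

k = 5
rho = scale_exponent(k)                      # 3, since 4 < 5 <= 8
row = [Fraction(9, 10)] * k
col = [Fraction(-9, 10)] * k
eta, partials = scaled_row_sum(row, col, rho)
```

In this example the unscaled sum would be 4.05, outside the machine range, whereas every scaled partial sum stays below unity in absolute value.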
Next we find o as the greatest integer for which 


(4.4) (3) y'l+ to) <1 


(ii) ofr+op 


and put 7 = yt(o). 
In practice ¢ may be determined as follows. 
First, if x


0, then y = 0, ando = 
problem is trivial, since also 


0 is satisfactory. In this case the scaling 


- a 
c= R=0,anM = ; 

0 ¥ 
n(—-x), 2" S 9 <1 


The following discussion may then be restricted to x * 0. We put || y || = 
Z . Then if x S A H+ op, 
case 2' < || Zz! 


-= 
s 


take o x. In this 
% || < 1. If, however x > A + p take o \ + p. Then || Z || 
nt(—x ++ p), whence t (—1 —x +A +p) S ||}%!| < 27. 


Obviously 


t= 


— Bat(—p + oa) 


=a2(—-u), w=rA\+p-—-e. 


By (i), 0 < || || < 1, and by (ii) wp 2 O. We note that if x < > + p, then 
u> 0;ifx SA+4+ p, theny 


= 0. 


— ~ i = al e 
For the minimum value 0 of 4, = —A~a, so that Z is never scaled higher 
than the vector it represents. This is reasonable, since no gain in accuracy 
accrues from this type of overscaling. 


Having determined ¢ we compute next 


v= 


a*it(—p), §¢ = (po + u) = a*Z. 
The quantity # should also be computed by double precision. 
scaled. 


If we now recall that O S — ¢ < B < 1, we see that ¢ is already properly 
This is also true of 6 = 8 + ¢. 


The formation of 5’ again requires scaling. Let 5 be standardized: 


6é=ul—»v), 2’ Si<l. 
Then « = t(—yv)/6 = yt(—v) will be subject to 27 < « < 1. 
Continuing, we find that the quantity s 
Z ile <1. 
Also, s 


ix is properly restricted: || s || = 
= ¢cl(—y — »). 
Similarly, if 7 = Zs*, then || JT |) = 


Ez 


|| s || < l,and 7 = Rt(—2yp — »). 
The remaining part of the computation necessitates one more scale factor. 
Its determination is facilitated by the following facts. 


1. Since $\gamma > 0$, the diagonal entries of the matrix $R = xx^*\gamma$ are positive; $\|R\|$ is assumed on the diagonal.

2. $C = A^{-1} + R$ is positive definite. Thus

$$\|C\| \le \|A^{-1}\| + \|R\|, \qquad \|A^{-1}\| \le \|C\|, \qquad \|R\| \le \|C\|.$$

Consequently, if $\|A^{-1}\| \ge \|R\|$, then $\|A^{-1}\| \le \|C\| \le 2\|A^{-1}\|$; if, however, $\|A^{-1}\| \le \|R\|$, then $\|R\| \le \|C\| \le 2\|R\|$. In both cases, then,

$$\max(\|A^{-1}\|, \|R\|) \le \|C\| \le 2 \max(\|A^{-1}\|, \|R\|).$$
Since $A^{-1} = Bt(\lambda)$, $R = Tt(2\mu + \nu)$, it follows that

$$\|C\| \le 2 \max(\|B\|\, t(\lambda),\ \|T\|\, t(2\mu + \nu)).$$

Putting $\phi = \max(\lambda, 2\mu + \nu) + 1$, we may thus conclude that $\|C\|\, t(-\phi) < 1$. However, the adoption of the scale factor $t(-\phi)$ for all cases may entail some overscaling, as we shall see presently.
The exponent $\psi = \phi - 1$ certainly suffices to restrict

$$U = B/t(\psi - \lambda) = A^{-1}\, t(-\psi),$$
$$V = T/t(\psi - 2\mu - \nu) = R\, t(-\psi),$$

properly: $\|U\| < 1$, $\|V\| < 1$.

But in the formation of $W = U + V = Ct(-\psi)$ capacity may be exceeded. To provide for this possibility we put $Z = W/t(\pi) = Ct(-\psi - \pi)$, $\pi = 0$ or $1$, and are then assured of $\|Z\| < 1$. The actual value of $\pi$ is best determined by computing the diagonal elements of $Z$ first.

Let us see now how the choice of t(—w) as scale factor for C affects the scaling 
of the total inverse M~’. Clearly || M~™ || = max (|! C ||, y), or, if we introduce 
F = C''t(—y), then || F || = max (|| W ||, «t(v — y)) < 1. While it is thus 
certain that F is of modulus less than unity, the scaling exponent y raay make this 
modulus unnecessarily small. To recognize this fact we consider these distinct 
cases: 

lA S 2+». Herey = d, and||F il >| Wiz] Ul] =| Bill = 2°. 
Thus y is the correct exponent for C™’, and x = 0. 

2%4<2u+yv,yn = 0. Nowy = v,and || F || >« = 2”. Here y again is the 
correct exponent, and x = 0. 

3.\ < 24 + v, un > O. In this case Y = 2u + v. Also it was established pre- 
viously—below (4.3)—that for » > 0, ||%{| = 2”. Thus || F || = || W || = 

V | ='| T!| = | Z|!’ = 27°. The exponent y may then be low by a factor as 
large as 2°. , 

For all three cases we may consequently put D = Ft(v), v = 0, 1, 2 in order 
to guarantee 2 S | D|| < 1. 

Denoting the total scaling exponent of M~ by w, we have 


(4.5) D = M” t(—w), w=yYytar--v. 


There only remains the proper alignment of the parts C, c, y of M7: 
C Z t(v) Ct(—w), 
(4.6) 7 = xt(@w — ») yt(—w), 
=s/t(w~—pw— v) = ct(—w). 


In summary, the proposed scaling scheme is to be used as follows. 
Start with ay, = aot (1 _ Ai), 7 Sa < . put B, = ait t(—),). At the start 
of the kth stage, 1 < k <n — 1, B, = B,\ = \are known, and B = A7't(—)),
2°' < || B|! < 1. To obtain a scaled representative D of M~', where M = 


Aa 
, proceed thus. 
se” £ 


1. Determine p so that t(p — 1) < k S t(p), and put y = —Bat (—p). Then 
calculate o as the greatest integer for which (i) || y || t(¢) < 1, (ii)o SA + p. 
Finally, put = yt(c), and define u» = A + p — o. 

2. Compute f = a*il(—p),c = H(p + u),6 =8+¢.1f6 = ull —»v),2° Ss 
«<i, let« = «&(—»)/é. 

3. Puts = ix, T = is*. 

4. Introduce y = max (A, 24 + »v), and form U = B/t(y — A), V = 
T/t(y — 24 — v), W = U + V. Compute the diagonal elements of W first, 
thereby determining the exponent x = 0,1 in Z = W/t(x). 

5. Follow with C = Zi(v), finding v = 0, 1, 2 so that 2” < || @ || < 1. 

6. With w = ¥ + zm — » align x, s by calculating 7 = 
s/t(w — wp — v). 


Then D = c . =M~'t(—w), 27 s ||D|| < 1. 
a F 


k/Uw — v), &é= 


From E, = A,d(e) it follows that Ey’ = Az't(—.), and Egi, = Dt (—e + wo). 
Further, 


(4.7) det KE, 4 = 6, t(c) det Ex . det E, = jl. 


5. Digital operations and basic estimates. In the discussion of the inversion 
problem no mention was made of the fact that in translating the procedure from 
theory to practice certain errors are unavoidable. They stem from the necessity 
of having to replace mathematical operations on exact numbers by digital 
numbers. A detailed discussion of the nature of digital numbers and operations 
may be found in the paper of J. von Neumann and H. Goldstine [2]. 

Adopting some of the notions and relationships employed there, we define a "digital number" $\bar\gamma$ as an aggregate $\bar\gamma = \operatorname{sgn}(\bar\gamma) \sum_{i=1}^{p} \alpha_i \beta^{-i}$, with $\beta$, the base, denoting an even positive integer $\ge 2$, $\alpha_i$, the digits, assuming the values $0, 1, 2, \cdots, \beta - 1$, and $\operatorname{sgn}(\bar\gamma)$, the sign, being $\pm 1$. A digital number necessarily lies in the interval $(-1, +1)$.

The "digital" operations of addition $(+)$ and subtraction $(-)$ have their usual meaning. Digital multiplication $(\times)$ and digital division $(\div)$, however, lead to numbers generally having more than $p$ digits. The product of two digital numbers $\bar\gamma$ and $\bar\delta$ has $2p$ places, and will be denoted by $\bar\gamma \times\times\, \bar\delta$ (double precision multiplication); if it is desired to keep only the $p$ more significant places, a (rounded) product $\bar\gamma \times \bar\delta$ is obtained. The rounding, if necessary, will be assumed to be of the ordinary type, that is, the product is truncated after $p + 1$ places, $\beta/2$ units are added to the $(p + 1)$st place, the possible carries thus produced are effected, and then the first $p$ places only are kept. In this procedure the absolute value of the rounding error cannot exceed $\epsilon = \beta^{-p}/2$.
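The rounding rule and its error bound can be illustrated with exact arithmetic. The sketch below models a rounded digital product as the exact product rounded to $p$ places in base $\beta$, with ties rounded away from zero; this is an assumed simplification of the truncate-then-add-half procedure, and the function names are hypothetical.

```python
import math
from fractions import Fraction

def round_to_digits(value, base=2, p=8):
    """Round an exact value to p places in the given base, ties away from zero."""
    scale = base ** p
    scaled = Fraction(value) * scale
    sign = -1 if scaled < 0 else 1
    nearest = math.floor(abs(scaled) + Fraction(1, 2))
    return Fraction(sign * nearest, scale)

def rounded_product(g, d, base=2, p=8):
    """Single precision digital product: exact product rounded to p places."""
    return round_to_digits(Fraction(g) * Fraction(d), base, p)

# Two-place decimal example: 0.37 * 0.45 = 0.1665 rounds to 0.17.
g, d = Fraction(37, 100), Fraction(45, 100)
prod = rounded_product(g, d, base=10, p=2)
eps = Fraction(1, 2 * 10 ** 2)               # beta**(-p) / 2
```

The difference between `prod` and the exact product is 0.0035 here, within the bound `eps` of 0.005.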
The basic inequalities relating digital to true operations are:
(5.1) $|\bar\gamma \times \bar\delta - \bar\gamma\bar\delta| \le \epsilon, \qquad |\bar\gamma \div \bar\delta - \bar\gamma/\bar\delta| \le \epsilon.$
The digital operation of “iterated halving” satisfies 
| 


7+2- €, 


/sy 
| + 2° — 4/2*| < 2. 


Further, if | 7 — 6! < 2ke then 
|¥ + 2% — §/2*| < 2e(1 + k/2%). 


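Next, let $\bar c = (\bar\gamma_i)$ be a digital row vector of $k$ components, $\bar d = (\bar\delta_i)$ a similar column vector. Then for their digital inner product

$$\bar c \times \bar d = \sum_{i=1}^{k} (\bar\gamma_i \times \bar\delta_i)$$

we have

(5.2a) $|\bar c \times \bar d - \bar c \bar d| \le k\epsilon.$

However, if double precision $\bar c \times\times\, \bar d = \sum_{i=1}^{k} (\bar\gamma_i \times\times\, \bar\delta_i)$ is employed, then

(5.2b) $|\bar c \times\times\, \bar d - \bar c \bar d| \le \epsilon.$

Finally, let $\bar A$, $\bar B$ be digital matrices of common dimension $k$. Then clearly, by (5.2), $\|\bar A \times \bar B - \bar A \bar B\| \le k\epsilon$, $\|\bar A \times\times\, \bar B - \bar A \bar B\| \le \epsilon$.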

Inequalities (5.2) also furnish estimates for triple products of k-dimensional 
digital matrices: 


'd x (Bx ©) — ABC! C) — A(B x €) || 
— ABC 


A ||-|| Bx ¢ — BC|| 


IA WA + IA 


\Ax (Bx C) — ABC) Ss k1 +k) A Ie. 
Double precision leads to 
'A xXx (Bx~x C) — ABC | < A+ kA Nye. 


The rounding error due to the enforced discrepancy between true quantities 
and their digital representatives needs separate discussion. If c, d are the true 
vectors whose digital representatives are @, d, then we define the errors 

é—c|| = U., ld —d|| = Ua, 
and note that 


(53a) ||¢xd—cdl| S ke t+ Us ,|%| + Ue Do /5;| + kULUa. 
This may be recognized as follows:
|exd—ed|| S$ \|exd—ad|| + || ed — ed || + || ed — od | 
ke + || ed — d) || + || @ — od || 
ke + Usd %| + Ue Ls | |. 


However, | 5;| S |6;| + Ua, i = 1, 2, --- , k, whence (5.3a) follows immedi- 
ately. 
Double precision improves (5.3a) to 


(53b) |@xxd—ed| Se+UsLlH|+ Ue L | |+kUUe. 


Of interest is also the matrix @ X d of k’ elements. Since each element of dc 
is the result of a single multiplication, 

(5.4) |\dxXé—de|| Se+ Uallé|| + U.||d\| + UUs. 

6. The digital procedure. After outlining the scaled partitioning method we 
must now consider the translation of the exact mathematical technique into a 
mechanical technique of digital operations on digital numbers. Let it be sup- 
posed, then, that A has been digitalized: A = A. Starting with A; = (ay), 
we express Gy in the form &, = dof(1 — d,), 2” S & < 1, and compute B, = 
&(—i) + Gn. 

Suppose now, inductively, that B is a digital approximation to B. Then 
A-i = Bt(d) is a digital approximation to A~. 

1. Find p satisfying (4.3), accumulate 7 = —B XX 4Gt(—p), determine ¢ 
as the greatest integer not greater than \ + p such that || 9 || t(¢) < 1, form 
¥ = gt(c) using double precision, and then round off. 

2. Compute } = a* XX Zt(—p), § = dt(p + uw), and suppose that ¢ is suffi- 
cently accurate for the inequality, 0 < —& < B < 1, to be preserved. Next, 
obtain 6 = B + &, standardize 6:5 = it(1 — v), 2° S i < 1, and get & = 
t(—v) + 6. 

3. Determine § = @ X x, T= %X ®. 

4. Form U = B + ty — 2d), V = T + tty — Qu — v), W = UC + *‘V. Compute 
the diagonal elements of W first, and take = 0 or 1 so that all elements of 
Z = W ~ t(x) are less than unity. 

5. Find »y = 0, 1, 2 to get C = Zit(v) into the range 2" < || C' || < 1. 

6. Adjust the other parts x, 5: 


7 =k + tw — »), @é=5 + tw —yp-— v), 


C 


c* 


to obtain in D = | 


| a digital approximation to D. 


7. Bound for the rounding error. It has been pointed out that the total rounding error of any quantity $\delta = f(\gamma_1, \gamma_2, \dots)$ stems from two sources: the rounding error due to the digital representations $\bar\gamma_1, \bar\gamma_2, \dots$ of the numbers $\gamma_1, \gamma_2, \dots$ and the rounding error due to the replacement of true arithmetical operations occurring in the function $f$ by digital ones. It is the purpose of this section to examine these errors in order to arrive at estimates permitting an evaluation of the final accuracy obtainable.

The analysis of these errors leads to the following theorem. 

THEOREM. If the scaled digital inverse $\bar B$ of $A^{-1}$ is afflicted with an error $E = \|\bar B - B\|$, then the scaled digital inverse $\bar D$ of $M^{-1}$ has an error $\bar E = \|\bar D - D\|$, which is subject to

(7.1) $\bar E \le [23 + 9v\,t(\nu)]\epsilon + \tfrac{9}{8}[2v\,t(\nu) + 1]^2\,t(\lambda - \omega)E$

with $v = \max(1, r\,t(\mu))$, $r = \sum_i |\alpha_i|$.

The error may thus accumulate from stage to stage, requiring iteration to 
keep it within reasonable bounds. In that the method of partitioning easily 
permits such iteration whenever necessary lies one of the advantages of this 
method. The bounds (7.1) are readily computed as the inversion proceeds; they
are frequently sufficiently close to be of practical utility. 
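How the bound (7.1) might be evaluated alongside the inversion can be sketched as follows. This is an illustration only: it assumes that the scaling function is $t(x) = 2^x$ (consistent with the way the bound combines $t(\lambda - \omega)$, $t(\omega)$ and $t(2\nu)$, but an assumption nonetheless), and the function and parameter names are hypothetical.

```python
def t(x):
    """Scaling function, assumed here to be t(x) = 2**x."""
    return 2.0 ** x


def inverse_error_bound(eps, E, r, mu, nu, lam, omega):
    """Right-hand side of (7.1).

    eps   : bound on a single rounding error
    E     : error of the scaled digital inverse from the previous stage
    r     : sum of the absolute values of the elements of the bordering vector
    mu, nu, lam, omega : scaling exponents of the current stage
    """
    v = max(1.0, r * t(mu))
    rounding_term = (23.0 + 9.0 * v * t(nu)) * eps
    inherited_term = 9.0 / 8.0 * (2.0 * v * t(nu) + 1.0) ** 2 * t(lam - omega) * E
    return rounding_term + inherited_term


if __name__ == "__main__":
    # With all exponents zero and r <= 1 the bound reduces to 32*eps + (81/8)*E.
    print(inverse_error_bound(eps=2.0 ** -40, E=2.0 ** -38, r=0.5,
                              mu=0, nu=0, lam=0, omega=0))
```

Calling the function once per stage, with the previous stage's bound passed in as `E`, shows how quickly the inherited term grows and hence when an iteration is advisable.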

Let us now prove the theorem. The inversion starts with the computation of 
B, = i(—),) + Qu. Thus, 


E, = || B, — By || = || (—d)/au — (—M) + Gn || Se 


The first quantity computed at the kth stage is £; its largest error will be E, = 
|| % — £||. From 


t—Z= Bx~x at(\ — v) — Bat(d — pz) 


B xx at(\ — uw) — Bat(d — ») + (B — B)at(s — pw) 


we infer E, < « + rEt(A — uw). Next we estimate E; = | ¢ — [| . Since 
§ — & = azt(u) — 4 XX Stu) 
dit(u) — @ XX Ht(u) + at(u)(z — 2), 


clearly E; S ¢€ + rEzt(y). 

Continuing with the estimates we remark, further, that for D to be positive 
definite it is necessary that 1/6 be positive. This, in turn, necessitates 8 + — > 0. 
Indeed the inequality 8 + — > E. may be used as a test for singularity of M 
relative to this process, in that if it fails, there could be a singular matrix M which 
would yield by this computation the same pivotal element Z, while if it holds, 
(and C exists) we may be sure of the nonsingularity of M. We observe in passing 
that E; is a function of the computation as well as the matrix and may be im- 
proved in case the test fails. 

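Since $\delta = \beta + \zeta$, $\bar\delta = \beta + \bar\zeta$, we have
$$E_\delta = |\bar\delta - \delta| = |\bar\zeta - \zeta| = E_\zeta.$$
Further, $E_\kappa = |\bar\kappa - \kappa| = |t(-\nu)/\delta - t(-\nu) \div \bar\delta|$, so that
$$E_\kappa \le t(-\nu)\,|1/\delta - 1/\bar\delta| + |t(-\nu)/\bar\delta - t(-\nu) \div \bar\delta| \le t(-\nu)\,\frac{|\bar\delta - \delta|}{\delta\bar\delta} + \epsilon = \frac{E_\zeta/\delta}{t(\nu)\bar\delta} + \epsilon.$$
However, $t(\nu)\bar\delta = 2\hat\delta \ge 1$, so that $E_\kappa \le \epsilon + \gamma E_\zeta$. Proceeding in the same manner,
we find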
we find 


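$$E_s = \|\bar s - s\| = \|\bar x \times \bar\kappa - x\kappa\|$$
$$\le \|\bar x - x\|\,\kappa + \|\bar x\|\,|\bar\kappa - \kappa| + \|\bar x\bar\kappa - \bar x \times \bar\kappa\|$$
$$\le E_x\kappa + E_\kappa\|\bar x\| + \epsilon.$$
But $\|\bar x\| \le E_x + \|x\| \le E_x + 1$, $\kappa \le 1$. Thus $E_s \le \epsilon + E_x + (E_x + 1)E_\kappa$.
Employing the same technique for $E_T = \|\bar T - T\|$:
$$E_T = \|xs^* - \bar x \times \bar s^*\| \le \epsilon + E_s + (E_s + 1)E_x.$$
Next, we must bound $E_C = \|\bar C - C\|$:
$$E_C \le \|B/t(\omega - \lambda) - \bar B \div t(\omega - \lambda)\| + \|T/t(\omega - 2\mu - \nu) - \bar T \div t(\omega - 2\mu - \nu)\|$$
$$\le 2\epsilon + E\,t(\lambda - \omega) + E_T\,t(2\mu + \nu - \omega).$$
Finally,
$$E_\gamma = |\bar\gamma - \gamma| \le \epsilon + E_\kappa\,t(\nu - \omega), \qquad E_c = \|\bar c - c\| \le \epsilon + E_s\,t(\mu + \nu - \omega).$$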

It is seen that the bounds for $E_s$ and $E_T$ contain second order terms of the form $E_\alpha E_\beta$, where $E_\alpha$, $E_\beta$ are bounded thus:
$$E_\alpha \le \alpha_1\epsilon + \alpha_2E, \qquad E_\beta \le \beta_1\epsilon + \beta_2E.$$
Consequently,
$$E_\alpha E_\beta \le \alpha_1\beta_1\epsilon^2 + (\alpha_1\beta_2 + \alpha_2\beta_1)\epsilon E + \alpha_2\beta_2E^2.$$
However, it would be too cumbersome to carry along expressions of this type, and so certain simplifications recommend themselves.

Clearly, if any of the error moduli $E_x$, $E_\zeta$, etc., exceeded unity there would be no accuracy left at all. Thus the assumption $E_\alpha \le 1$ is certainly justified. We shall in the following make the supposition that all quantities may be computed to at least two accurate binary places, an assumption which certainly does not impose any undue restrictions on the discussion.

In that case $E_\alpha \le 2^{-3}$, $E_\beta \le 2^{-3}$ and

(7.3) $E_\alpha E_\beta \le 2^{-3}\min(E_\alpha, E_\beta).$
Making use of this inequality and putting $h = \gamma + \gamma r\,t(\mu)$ we may summarize our results as follows:
$$E_x \le \epsilon + r\,t(\lambda - \mu)E,$$
$$E_\zeta \le [1 + r\,t(\mu)]\epsilon + r^2t(\lambda)E,$$
$$E_\kappa \le (1 + h)\epsilon + \gamma r^2t(\lambda)E,$$
$$E_s \le [(25/8) + h]\epsilon + [(9/8)t(-\mu) + \gamma r]\,r\,t(\lambda)E,$$
$$E_T \le [(21/4) + h]\epsilon + [(9/4)t(-\mu) + \gamma r]\,r\,t(\lambda)E,$$
$$E_C \le \{2 + [(21/4) + h]\,t(2\mu + \nu - \omega)\}\epsilon + \{1 + [(9/4)t(-\mu) + \gamma r]\,r\,t(2\mu + \nu)\}\,t(\lambda - \omega)E.$$
Now in the three cases discussed in Section 4, preceding (4.5), the quantity $2\mu + \nu - \omega$ never exceeds 2. Therefore,
$$E_C \le [23 + 4h]\epsilon + \{1 + [(9/4)t(-\mu) + \gamma r]\,r\,t(2\mu + \nu)\}\,t(\lambda - \omega)E.$$
Similarly,
$$E_\gamma \le [5 + 4h]\epsilon + \gamma r^2t(\lambda + \nu - \omega)E,$$
$$E_c \le [(27/2) + 4h]\epsilon + [(9/8)t(-\mu) + \gamma r]\,r\,t(\lambda + \mu + \nu - \omega)E.$$

Clearly the bound for $E_C$ is larger than that for $E_c$, which, in turn, is larger than that for $E_\gamma$. It follows that $\bar E = \max(E_C, E_c, E_\gamma)$ also does not exceed the bound for $E_C$.


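From $E_\kappa = |\bar\kappa - \kappa| = |\bar\kappa - \gamma\,t(-\nu)|$ we may infer that:
$$\gamma \le (\bar\kappa + E_\kappa)\,t(\nu) \le (1 + 2^{-3})\,t(\nu) = (9/8)\,t(\nu).$$
Obviously $2^{-1}(1 + r\,t(\mu)) \le \max(1, r\,t(\mu)) = v$. Thus, $h = \gamma(1 + r\,t(\mu)) \le (9/4)\,v\,t(\nu)$, and
$$\bar E \le [23 + 9v\,t(\nu)]\epsilon + [1 + (9/2)v\,t(\nu) + (9/2)v^2t(2\nu)]\,t(\lambda - \omega)E,$$
which we may write as:
$$\bar E \le [23 + 9v\,t(\nu)]\epsilon + \tfrac{9}{8}[2v\,t(\nu) + 1]^2\,t(\lambda - \omega)E.$$

From the digital matrix $\bar D$ the numerical inverse $\bar M^{-1}$ of $M$ is obtained by upscaling: $\bar M^{-1} = \bar D\,t(\omega)$. If the error of $\bar M^{-1}$ is denoted by $E(\bar M^{-1})$, then clearly $E(\bar M^{-1}) = \bar E\,t(\omega)$. By (7.1) then,
$$E(\bar M^{-1}) \le [23 + 9v\,t(\nu)]\,t(\omega)\epsilon + \tfrac{9}{8}[2v\,t(\nu) + 1]^2\,t(\lambda)E.$$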
ON CERTAIN CLASSES OF STATISTICAL DECISION PROCEDURES


By H. S. Konijn
University of California, Berkeley 


Summary. The paper considers classes of decision procedures which in certain ways put a bound on the associated losses of incorrect terminal decision or cost of experimentation. Conditions are given under which these classes fulfill the conditions that Wald imposes on classes of decision procedures in his general theory.


1. Introduction. We investigate a problem arising in Wald’s general theory 
of statistical decision functions (all references to Wald are to [1]), and freely 
use his notations. 

Wald considers the general case, in which sampling may be done by stages, by means of a decision procedure $\delta$ fixed in advance. This procedure determines at each stage $k \ge 0$ of experimentation whether to continue experimentation or not, on the basis of the observations obtained thus far (and perhaps also with the aid of an additional randomization experiment). In case the procedure indicates that a $(k + 1)$st stage is to take place, it also determines, on the same basis, which subset $d^e_{k+1}$ of the possible collection $D^e$ of variables is to be observed next. In case the procedure leads to termination of experimentation, it will also designate a particular final decision $d^t$ contained in the set $D^t$ of preadmitted, possible terminal decisions. Thus, for any given procedure $\delta$, the experimental and terminal decisions to be taken are random variables, which depend on the sample point $X = \{X_1, X_2, \dots\}$ to be observed, and, in case of a randomized procedure, on the randomization experiments to be performed; we may denote them by $\delta^e(X) = \{\delta^e_1(X), \delta^e_2(X), \dots\}$ and $\delta^t(X)$, respectively.

Let $W(F, d^t)$ denote the loss due to taking the terminal decision $d^t$ when $F$ is the true distribution, and $c(d^e_1, \dots, d^e_k, x)$ the cost of observing stagewise the $k$ sets of variables $d^e_1, \dots, d^e_k$ when the observed sample point equals $x = \{x_1, x_2, \dots\}$. Let
$$P\{F, y \mid \delta\} = \Pr\{W(F, \delta^t(X)) > y \mid F, \delta\}$$
and
$$Q\{F, z \mid \delta\} = \Pr\{c(\delta^e(X), X) > z \mid F, \delta\},$$
where $F$ and $\delta$ after the vertical bar indicate that the probabilities are to be computed under the assumption that $F$ is the true distribution and $\delta$ the adopted decision procedure.


We adopt Wald’s assumptions 3.1 to 3.5 which are as follows. 


Received 5/29/52, revised 2/14/53. 
(a) The class of stochastic processes $F$ consists either of discrete processes only, or of absolutely continuous processes only.

(b) The loss $W$ is a bounded function, uniformly continuous in its second argument, with the modulus of continuity independent of the first argument.

(c) The cost $c$ cannot decrease when a stage of observation is added, is a nonnegative unbounded function of the number of observations (uniformly in the other variables), and for any $k$, $d^e_1, \dots, d^e_k$, is either a bounded function of the values of the observations or identically equals $\infty$.

(d) The space $D^t$ of possible terminal decisions is compact (in the uniform topology with respect to $W$).

It may be noted that in the greater part of Wald’s book it is assumed that 
W is nonnegative, but that in a few places this assumption is violated; however, 
the assumption that W is bounded is not violated, and this allows one to make 
adjustments in all proofs which make use of the nonnegativeness of W. The 
expected loss from incorrect terminal decisions and the expected cost of experimentation equal
$$r_1(F, \delta) = \int_0^\infty P\{F, y \mid \delta\}\,dy$$
and
$$r_2(F, \delta) = \int_0^\infty Q\{F, z \mid \delta\}\,dz,$$
respectively.
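The representation of an expected loss as the integral of a tail probability can be illustrated on a toy case. The sketch below assumes a nonnegative loss taking finitely many values; the function names and the numerical integration scheme (a midpoint rule) are illustrative choices, not part of Wald's theory.

```python
def tail_probability(losses, probs, y):
    """P{loss > y} for a discrete loss distribution."""
    return sum(p for loss, p in zip(losses, probs) if loss > y)


def risk_from_tail(losses, probs, steps=3000):
    """Midpoint-rule approximation of the integral of P{loss > y} over [0, max loss]."""
    upper = max(losses)
    width = upper / steps
    return width * sum(
        tail_probability(losses, probs, (i + 0.5) * width) for i in range(steps)
    )


def expected_loss(losses, probs):
    """Direct computation of the expected loss."""
    return sum(loss * p for loss, p in zip(losses, probs))


if __name__ == "__main__":
    losses = [0.0, 1.0, 3.0]     # hypothetical loss values
    probs = [0.5, 0.3, 0.2]      # hypothetical probabilities under some F
    print(expected_loss(losses, probs), risk_from_tail(losses, probs))
```

The two printed numbers agree up to the error of the numerical integration, which is what the integral formula for the expected loss asserts in this case.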

Speaking generally, Wald's theory is concerned with the search for procedures among a given class $\mathfrak{D}$, which in some sense minimize the "risk" $r(F, \delta) = r_1(F, \delta) + r_2(F, \delta)$. It may in various cases be desirable not to admit to the competition those procedures of an otherwise naturally arising class for which (1) $r_1$, or $r_2$, is larger than a given number, (2) the chance that $W$, or $c$, exceeds a given amount is larger than some small number, (3) $r$ is not a bounded function of $F$, or, (4) the number of stages of experimentation has a positive probability of being unbounded. If so, it may be convenient to know whether the assumptions on $\mathfrak{D}$ under which Wald derives his general results still hold for the restricted class if they hold for $\mathfrak{D}$. The present paper is addressed to that question.


2. Statement of results. For given numbers $\alpha$, $\beta$, $y = y(\alpha)$, $z = z(\beta)$, $y_0$, and $z_0$, we define the following subclasses of any given class $\mathfrak{D}$ of decision procedures:
$$\mathfrak{D}\{\alpha, y(\alpha); -\} = \{\delta \in \mathfrak{D}: P\{F, y(\alpha) \mid \delta\} \le \alpha \text{ for all } F\},$$
$$\mathfrak{D}\{-; \beta, z(\beta)\} = \{\delta \in \mathfrak{D}: Q\{F, z(\beta) \mid \delta\} \le \beta \text{ for all } F\},$$
$$\mathfrak{D}_{y_0} = \{\delta \in \mathfrak{D}: r_1(F, \delta) \le y_0 \text{ for all } F\},$$
$$\mathfrak{D}^{z_0} = \{\delta \in \mathfrak{D}: r_2(F, \delta) \le z_0 \text{ for all } F\}.$$
The principles of notation adopted here are these: preceding the semicolon are conditions in terms of $P\{F, y \mid \delta\}$, following the semicolon are conditions in terms of $Q\{F, z \mid \delta\}$; the subscripts denote restrictions on $r_1$, and the superscripts restrictions on the cost of experimentation not stated directly in terms of $Q\{F, z \mid \delta\}$.

We inquire whether, if Wald's general assumptions 3.6 on the class of decision procedures hold for $\mathfrak{D}$, they also hold for these subclasses. Using the notation given below, these assumptions can be stated as follows.

(i) $\mathfrak{D}$ is convex.

(ii) $\mathfrak{D}$ is closed (in the regular sense of convergence defined by Wald).

(iii) For any $k$, there exists $c_k$ such that $p(d^e_{(k)} \mid x_{[k-1]}, \delta)$ vanishes whenever $d^e_{[k]}$ is not contained in $(1, \dots, c_k)$.

(iv) $p(d^e_{(k)} \mid x_{[k-1]}, \delta)$ vanishes whenever $c(d^e_{(k)}, x_{[k-1]}, x^k) = \infty$ identically in $x^k$.

(v) Given $k \ge 0$ and $\delta$ in $\mathfrak{D}$, there exists $\delta^0$ in $\mathfrak{D}$ such that $\delta^0\{D^t \mid d^e_{(k)}, x_{[k]}\} = 1$, whatever be $d^e_{(k)}$ or $x_{[k]}$, while
$$\delta^0\{D \mid d^e_{(h)}, x_{[h]}\} = \delta\{D \mid d^e_{(h)}, x_{[h]}\},$$
whatever be $D \subset D^e \cup D^t$, $h < k$, $d^e_{(h)}$, or $x_{[h]}$. (Wald's formulation of this assumption, which permits truncation of the decision procedure at any stage $k \ge 0$, is unnecessarily strong.)

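We also examine from this point of view
$$\mathfrak{D}^{(b)} = \{\delta \in \mathfrak{D}: r(F, \delta) \text{ is a bounded function of } F\}$$
(called $\mathfrak{D}_b$ in Wald, p. 100), and
$$\mathfrak{D}^{(c)} = \{\delta \in \mathfrak{D}: \text{for any given } F \text{ the number of stages of experimentation is bounded almost certainly}\}.$$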


We find that if © satisfies Wald’s assumptions 3.6, then for every set of 
numbers io , 20 , a, 8, 2 = 2(8), and all except an at most denumerable collection 
of choices of y = y(a@), this assumption holds also for 


Dy, » D*, Di —; B, 2(8)}, Dia, y(a); 0, z(0)}, D”, D”, 


(b) ¢ 


D” fa, y(a); —}, D fa, y(a); —}, Da, y(a); —}, 
and thus also for 
2” fa, y(a); 8, 2(8)}, D’ fa, y(a); B, 2(B)}, D*{a, y(a); B, 2(8)}, ete. 


Let D” denote the class of all possible decision procedures subject to (iii) 
only, and D"” the subclass of D° which satisfies (iv). Wald remarks that D”’ 
satisfies all of his assumptions 3.6, so that all classes mentioned in the preceding 
paragraph satisfy them if D = D"”. Moreover, it is easy to see that all these 
classes except D,, , D°, and D“’ {a, y(a); —} already satisfy them if D = D”. 
Note also that property (v) is not needed for any of Wald’s proofs when ®D is 


(ft 


. . »)(b IG (0)z 
contained in ®"'*”, D eo”. 
3. Notation. We occasionally make Wald's notation a little more explicit or condensed, and also introduce some new notation. Represent $D^e$ by $(1, 2, \dots)$ and let
$$d^e_{(k)} = (d^e_1, \dots, d^e_k), \qquad d^e_{[k]} = d^e_1 \cup \dots \cup d^e_k.$$
Write xu.) = {xi(i ¢d))}, Mu) for the set of all possible values of rx) , Muy 
for a bounded measurable subset of My) ; write x* = {2,(ied})}, M* for the 


set of all possible values of 2°. 
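When $F$ is the true distribution, let
$$F(x^k \mid x_{[k-1]}) = \Pr\{X_i \le x_i\ (i \in d^e_k) \mid X_j = x_j\ (j \in d^e_{[k-1]})\},$$
$$F(x_{[k]}) = \Pr\{X_j \le x_j\ (j \in d^e_{[k]})\}.$$

We recall, in a somewhat more specific form, the following definitions of Wald's Sections 1.2 and 3.1.4.
$$p(D^t \mid 0, \delta) = \delta(D^t \mid 0),$$
$$p(d^e_{(k)} \mid x_{[k-1]}, \delta) = \delta(d^e_1 \mid 0)\,\delta(d^e_2 \mid d^e_1, x_{[1]}) \cdots \delta(d^e_k \mid d^e_{(k-1)}, x_{[k-1]}),$$
$$p(d^e_{(k)}, D^t \mid x_{[k]}, \delta) = p(d^e_{(k)} \mid x_{[k-1]}, \delta)\,\delta(D^t \mid d^e_{(k)}, x_{[k]}),$$
$$q(d^e_{(k)}, D^t \mid F, \delta) = \int_{M_{[k]}} p(d^e_{(k)}, D^t \mid x_{[k]}, \delta)\,dF(x_{[k]}),$$
$$M^k(z \mid d^e_{(k)}, x_{[k-1]}) = \{x^k \in M^k: c(d^e_{(k)}, x_{[k-1]}, x^k) \le z \mid d^e_{(k)}, x_{[k-1]}\},$$
$$D^t\{F, y\} = \{d^t \in D^t: W(F, d^t) > y \mid F\}.$$

Then
$$P\{F, y \mid \delta\} = \sum_{k=0}^{\infty} \sum_{d^e_1} \cdots \sum_{d^e_k} q(d^e_{(k)}, D^t\{F, y\} \mid F, \delta),$$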


i { p(dix) | ©(k—1) , 5) dF (x* | 2x1) } dF (x(.-1}). 
M(k-1) 


mt 2\d‘,).2[k—-1)) 


4. Convexity. 

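THEOREM 1. Let $\mathfrak{D}$ be convex. Then $\mathfrak{D}\{\alpha, y(\alpha); -\}$ and $\mathfrak{D}\{-; \beta, z(\beta)\}$ are convex for every $\alpha$, $\beta$, $y = y(\alpha)$, $z = z(\beta)$; $\mathfrak{D}_{y_0}$ and $\mathfrak{D}^{z_0}$ are convex for every $y_0$, $z_0$; $\mathfrak{D}^{(b)}$ and $\mathfrak{D}^{(c)}$ are convex.

PROOF. Let $\delta_1$ and $\delta_2$ be elements of $\mathfrak{D}\{\alpha, y(\alpha); -\}$ and $0 < \theta < 1$, $\rho = 1 - \theta$. To show that there exists $\delta \in \mathfrak{D}\{\alpha, y(\alpha); -\}$ such that 3.14a, b (Wald) are satisfied, note that there exists such a $\delta \in \mathfrak{D}$ since $\mathfrak{D}$ is convex, and that this $\delta$ actually is an element of $\mathfrak{D}\{\alpha, y(\alpha); -\}$ as
$$P\{F, y(\alpha) \mid \delta\} = \sum_{k=0}^{\infty} \sum_{d^e_1} \cdots \sum_{d^e_k} \int_{M_{[k]}} p(d^e_{(k)}, D^t\{F, y(\alpha)\} \mid x_{[k]}, \theta\delta_1 + \rho\delta_2)\,dF(x_{[k]})$$
$$= \theta P\{F, y(\alpha) \mid \delta_1\} + \rho P\{F, y(\alpha) \mid \delta_2\} \le \alpha.$$
Quite similarly we show $Q\{F, z(\beta) \mid \delta\} \le \beta$.
Then
$$r_1(F, \delta) = \theta \int_0^\infty P\{F, y \mid \delta_1\}\,dy + \rho \int_0^\infty P\{F, y \mid \delta_2\}\,dy = \theta r_1(F, \delta_1) + \rho r_1(F, \delta_2),$$
so that if $\delta_1$ and $\delta_2$ are elements of $\mathfrak{D}_{y_0}$ with $\delta$ satisfying 3.14a, b (Wald), $\delta \in \mathfrak{D}_{y_0}$; and similarly we obtain the convexity of $\mathfrak{D}^{z_0}$, and thus of $\mathfrak{D}^{(b)}$.
The convexity of
$$\mathfrak{D}^{(c)} = \Big\{\delta \in \mathfrak{D}: \text{for each } F \text{ there exists } k' \text{ with } \sum_{k=0}^{k'} \sum_{d^e_1} \cdots \sum_{d^e_k} q(d^e_{(k)}, D^t \mid F, \delta) = 1\Big\}$$
is immediate.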
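The convexity argument can be illustrated on a toy case. The sketch below is only an illustration under simplifying assumptions: each procedure is represented, for one fixed $F$, by a discrete distribution of its loss (a dictionary from loss value to probability), and the mixed procedure is taken to follow $\delta_1$ with probability $\theta$ and $\delta_2$ with probability $\rho = 1 - \theta$. All names and numbers are hypothetical.

```python
def tail(dist, y):
    """P{loss > y} for a loss distribution given as {loss value: probability}."""
    return sum(prob for loss, prob in dist.items() if loss > y)


def mix(dist1, dist2, theta):
    """Loss distribution of the procedure that uses dist1 w.p. theta, dist2 w.p. 1 - theta."""
    rho = 1.0 - theta
    values = set(dist1) | set(dist2)
    return {v: theta * dist1.get(v, 0.0) + rho * dist2.get(v, 0.0) for v in values}


if __name__ == "__main__":
    alpha, y_alpha = 0.10, 0.5           # hypothetical constraint P{loss > 0.5} <= 0.10
    delta1 = {0.0: 0.90, 1.0: 0.10}      # satisfies the constraint with equality
    delta2 = {0.0: 0.95, 2.0: 0.05}      # satisfies the constraint strictly
    mixed = mix(delta1, delta2, theta=0.25)
    print(tail(mixed, y_alpha) <= alpha)
```

Because the tail probability of the mixture is the same convex combination of the two tail probabilities, a bound that holds for both components holds for the mixture, which is the step used in the proof.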


5. Closure. 

THEOREM 2. Let D be closed, and a subset of D®. Then Dy, and D*° are closed 
for every yo , 2 ; and D” is closed. 

Proor. By Wald’s Theorem 3.2, which holds for any subset of D®, we have, 
for ¢ = 1, 2, that 5; — 4 (in Wald’s regular sense) implies that there exists a 
subsequence {6,;;} such that 


lim inf r,(F, 6;;) = ri(F, 50); 
J=o 

thus, if fori = 1, 2,---, r(F, 4;) S ri(= yo fort = 1, = % for t = 2), then 
r(F, 6) < rifort = 1, 2. 

That ©” is closed follows similarly: r(F, 6:;) S M;, (say) < ©; therefore, 
for given « > 0 there is a j such that r(F, 50) S r(F,6;;) + ¢€ S Mi; +e <@. 

THEOREM 3. Let $\{\delta_i\}$ be a sequence of decision procedures converging to a decision procedure $\delta_0$ in the regular sense as defined by Wald. Then for all $z$ and $F$, $\lim_{i \to \infty} Q\{F, z \mid \delta_i\} = Q\{F, z \mid \delta_0\}$.

Proor. By Wald’s assumption 3.5, since 


mee 
/ dF (x* | t.-1) 
ME(s\d‘,).2(k-1)) 


vanishes when df, contains a sufficiently large number of elements, there exists 
for any F and z an integer k’ such that the terms in the expression for 
$1 - Q\{F, z \mid \delta\}$ with $k > k'$ do not make any contribution. We shall therefore
discuss a finite sum of terms, write 


1 — Q{F,z|6} = oe. 2 Tdiy dt ; F,z| 8}, 


k=} dj 


and show that if 6; — 49, lim;.7 |d&) ~ z| 6} = Tid ; F,z| do}. Let f be 
the elementary law corresponding to F, and fix F, z, k, dj, --- , dj, setting for 
simplicity in notation dj); = (1, ~ #. 

1) Case where F is discrete. The ¢ case where F has a finite number of jumps is 
immediate, so suppose it has an infinite number of jumps. Let T{df) ; F,z {6} = 
lim,-«8,(6) with 


nr(n) ni(n) 


s,(6) = as ‘ae {p(diry | | Tits > °° * » Trt, é) 


tp=1 ty}=1 


O7,2(Lit, Roe Fr » Dre, f (Xie, oe » Xrt~)}- 


k 
Oy2(Lity °° * Tete) = / f(a" | tu-1) dx” S 1, 
M*(2\d¢).2[k-1)) 


and for any j = 1, --- , 7, n;(n) is a nondecreasing integer-valued function of n. 
Since p(dix) | 214, , °°* » Tre, , 5:) S 1 and 


mr(n) ni(n) 


lim 7s _<- > I(aie, 5 °° y Lrt,) = 1, 


n= typ=l t)=1 


lima,-.» 8,(6;) exists and is finite for each 7. Now from the definition of regular 
; : 

convergence, limj_. D(di%) | Zit, 5 °° * 5 Cre, » Oi) = P(Aiey | Lie, °° * 5 Lee, » 50), 8O 

by Weierstrass’ rule 


Tidix); F,z | 60} = lim s,(69) = lim lim s,(6;) = lim T{d{, ; F, z | 6} 


n=O t=o N==0O 1c 


II) Case where F is absolutely continuous. 


k k 
¢2(Le-1) | dix)) - f(t) f(x | Ltx-1)) dx’, 
M*(2\d6,).=[k-1)) 


which is not greater than f(z ,_1)), satisfies the conditions of Wald’s Lemma 3.1, 
so (using the notation of Wald’s sections 3.9 and following) 


Tid‘y; F, z | 6;} [ P(dixy | Tun, 6)¢:(Tp-1 | du) dtu—y 
[k-1) 


[ ¢(2—1) | dt) dP(dt | Myx , 4). 
(k-1) 
But by the definition of regular convergence and Lemma 3.1 of Wald, the limit
as $i \to \infty$ of the last expression is


/ ¢2(2_-) | dix») dP(dix) | Muu ’ 50), 
M(k-1) 


which equals T'{d&) ; F, 2 | do}. 

COROLLARY. If $\mathfrak{D}$ is closed, $\mathfrak{D}\{-; \beta, z(\beta)\}$ is closed for every $\beta$ and $z = z(\beta)$.

THEOREM 4. Let $\{\delta_i\}$ be a sequence of decision procedures converging to a decision procedure $\delta_0 \in \mathfrak{D}^{(c)}$ in the regular sense as defined by Wald. Then, for any $F$, the chance variables $W(F, \delta^t_i(X))$ converge to $W(F, \delta^t_0(X))$ in distribution as $i \to \infty$, so that $\lim_{i \to \infty} P\{F, y \mid \delta_i\} = P\{F, y \mid \delta_0\}$ for all $y$ except an at most denumerable collection (depending on $\delta_0$).

Proor. Fix F, k, dj , --- , dg , and show that 

lim qd) ? D'{F, y} | F, 6;) = q(dix) ’ DF, y} | F, do) 
=o 
for all y except an at most denumerable collection. 

I) Case where F is discrete. Since W(F, d') is continuous in da‘, D'{F, y} is 
open for every y. If D'{F, y} = D'{F, y’} and y’ < y, D'{F, y} has an empty 
boundary. Now consider the ordered collection | D‘{F, y}} of shrinking sets for 
increasing y. Only at an at most denumerable collection of 4 can the boundary 
B{F, y} of D'\F, y} be nonempty with q(d%), B{F, y} | F, 60) positive; on the 
other y, lim;.. P{F, y|6;} = P{F, y | 6}, by the definition of regular con- 
vergence. 

Il) Case where F is absolutely continuous. For each integer m, consider 
the finite sequence | (kj = 1, +--+ ,7r;37 = 1,--- ,m) of Wald’s Section 
3.1.4. As 
Sup : z. P(d& 9 Di, .--tm | Mu ; 6;) 


22 E 
Dy... CD {Py} 
1 m 


< P(di, , DF, y} | Muy, 6.) 
Inf =D P(dix) , Di,---tm | May, 8:) 


Dy ...4 DD'{Pv} 
1 m 
we have by the definition of convergence in the regular sense 


Sup i P(dix) , Di,.--km | Mey , 60) 


=t >! 
D, x CD {Fw} 
1 m 


< liminf P(d&) , D'{F, y} | Mu), 6) < limsup P(dt, , D'{F, y} | Muy, 43) 


t=O $==00 


= Inf 7. P(dix,) , Dk, ee Mus ’ 5p). 


Dik DD Pal 

For all except an at most denumerable set of y-intervals (which may degenerate 
to points) the left- and right-hand sides can be made to differ by less than any 
preassigned positive number, by making $m$ large enough. But if, for $y' < y$,
$D^t\{F, y\} = D^t\{F, y'\}$, the left- and right-hand sides have the same value provided
we take $m$ so large that the diameter of each set in $\{D^t_{k_1 \cdots k_m}\}$ is less than
y — y', so that for all Muy we have 

lim P(dy ’ D'\F, y} | Mu ’ 6.) = P(dix) ’ DAF, y} | Mu , 5o) 

=a 
for all except an at most denumerable set of y. 

We now note that Wald’s Lemma 3.1 remains valid if we write 7% for 
Ti = 0,1, --- ), insert “except for an at most denumerable set Ds of y” after 
(3.50) and “except for an at most denumerable set D of y’’ after (3.51); this is 
seen by letting « and 1/c approach 0 through a denumerable set of values and 
considering in (3.58) the complement of D¢ = U,(Ds,), with D = U.U(D9. 
Consequently, as, for 7 = 0, 1,---, 


q(dix,, D'{F, y} | F,6) = [ f(xp)dP(duy , D' YF, y} | Muy 
(k) 


where f is the density corresponding to F, we have for all except an at most 
denumerable set of 
lim q(di , D'{F, y} | F,6) = gd , D'{F, y} | F, 80). 


Coro.uaRY. If D is closed, ~°” is closed, and 
DP’ fa, y(a); —}, Dia, y(a); 0, 2(0)}, and OD sa, y(a); — 


are closed for every choice of z = z(0), 2, and a, and all choices of y = y(a) ex- 
cept an at most denumerable collection. 

Proor. As above we prove that 

lim qd) ’ D' F, 6;) = qd‘) : D' F, 50) 
and obtain the closure of D” from the remark at the end of the proof of 
Theorem 1. 

For any z = 2(0), if de D}—; 0, 2(0)}, then the number of stages of experi- 
mentation is bounded almost certainly under the procedure 6, ; it follows from 
Theorem 4 that for all @ and all choices of y = y(a) except an at most denumer- 
able collection D“’ {a, y(a); —} as well as Dia, y(a); 0 2(0)} are closed. 

Let P(F, y, k’ | 6) be the probability (under F, 6) that W(F, 6‘) > y and the 
number k of stages of experimentation <k’; P(F, y | k’, 6) the probability that 
W(F, 5‘) > y, given that k < k’; P,-(F, y | 6) the probability that W(F, 5‘) > y 
and k > k’; and R(F, k’ | 6) the probability that k > k’. For i = 0,1, -- 


P{F,y|6:} = P(P, y, kb’ | 6) + Pa(F, y | = P(F, y, k’ | 6) + RF, k’ | 63), 


and for all y except an at most denumerable set P(F, y | k’, 6;) and 


1 — R(F,k’ 6) = a Dad (dix) , D' | P. 5;) 


k=0 dj 
converge to $P(F, y \mid k', \delta_0)$ and $1 - R(F, k' \mid \delta_0)$ respectively when $\delta_i \to \delta_0$,
so that $\lim_{i \to \infty} P(F, y, k' \mid \delta_i) = P(F, y, k' \mid \delta_0)$ for this set of $y$. Therefore we have
for this set of $y$

P(F, y, k’ |) < liminf P{F, y | 6,} 


t=20 


< limsup P{F, y|6;} < P(F, y, k’ | 60) + R(F, k’ | 60) 


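with $\lim_{k' \to \infty} R(F, k' \mid \delta_0) = 0$, for, since $\delta_0 \in \mathfrak{D}^{(b)}$, (3.40) of Wald's holds. Consequently, for all $y$ except an at most denumerable set, $\lim_{i \to \infty} P\{F, y \mid \delta_i\} = P\{F, y \mid \delta_0\}$ for $\delta_0 \in \mathfrak{D}^{(b)}$. Since $\mathfrak{D}^{z_0}$ is a subset of $\mathfrak{D}^{(b)}$, and $\mathfrak{D}^{z_0}$ and $\mathfrak{D}^{(b)}$ are closed, this gives the closure of both $\mathfrak{D}^{(b)}\{\alpha, y(\alpha); -\}$ and $\mathfrak{D}^{z_0}\{\alpha, y(\alpha); -\}$ for all $z_0$, $\alpha$, and all $y = y(\alpha)$ except perhaps a denumerable set.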
A DOUBLE SAMPLE TEST PROCEDURE¹


By Donald B. Owen
Purdue University 


1. Summary and introduction. Three different testing procedures which 
involve a minimum of modification of the usual single sample tests of the hypo- 
theses considered are given here. Tests are made by taking samples at two 
stages for testing the mean of a normal distribution. A known standard devia- 
tion is assumed, but an extension to the case where the standard deviation is 
unknown is also given. Special examples show that tests can be chosen so that 
the expected number of observations is less than the number required for the 
ordinary single sample test and indeed can give considerable savings. The tests 
in Sections 3 and 4 give the greater savings, but the powers are more difficult 
to evaluate than the power for the test of Section 2. Also, it is a little more 
work to apply the test in Section 4. Wald in [9] has discussed a sequential test 
where the observations are taken in groups. The tests given here could be con- 
sidered very special cases of this where the number of observations is truncated 
after two groups. Romig in [7] has set up a double sampling procedure for sam- 
pling from a finite population that is approximately normal where the rejection 
points are determined by preassigned engineering or specification limits and not
by the normal distribution itself as is done for the first sample of the double
sample tests given below. Bowker and Goode in [1] give tests similar to those 
given by Romig. Chapman in [2] and Stein in [8] have discussed two sample 
tests where the object is to obtain tests with the power independent of an un- 
known variance and where there is no upper limit on the number of observations 
required. There is a definite ceiling on the number of observations required for 
the tests presented here and they have many interesting properties that make 
them very desirable from the standpoint of saving of observations and sim- 
plicity. 


2. First test procedure. A double sample test for the hypothesis $H: m = m_0$ against $m < m_0$, where $m$ is the mean of a normal random variable, $X$, with known standard deviation, $\sigma$, will be constructed. Extensions to tests of other hypotheses will be clearly possible. Assume that the number of observations, $n$, of a single sample test has been determined in accordance with the methods mentioned in [3] so as to have a given probability of Type II error for $m = m_1$ where $m_1 < m_0$. Let $G(x) = (2\pi)^{-1/2} \int_{-\infty}^{x} e^{-t^2/2}\,dt$ and let $h$ be defined by $G(-h) = \alpha < \tfrac{1}{2}$. Let $n_1$ be the number of observations in the first sample of the double
Received 4/4/51, revised 3/2/53. 


1 Research done in part under the sponsorship of the Office of Naval Research at the 
University of Washington. 
sample test where $n_1 < n$. Let $p = n_1/n$ and let $\theta$ be a positive constant defined
by equation (1) below.

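For the double sample test, $n_1$ observations, $x_1, \dots, x_{n_1}$, are taken and $\bar x_1 = \sum_{i=1}^{n_1} x_i/n_1$ and $u_1 = \sqrt{n_1}(\bar x_1 - m_0)/\sigma$ are computed. If $u_1 < -\sqrt{p}\,h - \theta$ reject $H$; if $u_1 > -\sqrt{p}\,h + \theta$ accept $H$; and if $-\sqrt{p}\,h - \theta \le u_1 \le -\sqrt{p}\,h + \theta$ take $n$ additional observations.

For $n_1 + n$ observations, $\bar x_2 = \sum_{i=n_1+1}^{n_1+n} x_i/n$ and $u_2 = \sqrt{n}(\bar x_2 - m_0)/\sigma$ are computed. If $u_2 < -h$ reject $H$ and if $u_2 \ge -h$ accept $H$.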
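The two-stage rule just described can be sketched in code. This is a minimal illustration, not the author's program: the function name, the callback used to obtain the second sample, and the return convention are all hypothetical, and the constants $h$ and $\theta$ are assumed to have been determined beforehand.

```python
import math


def double_sample_test(first_sample, draw_second_sample, m0, sigma, n, h, theta):
    """Apply the two-stage rule; returns (decision, number of observations used).

    first_sample       : the n1 observations of the first stage
    draw_second_sample : callable k -> list of k further observations (hypothetical hook)
    m0, sigma          : hypothesized mean and known standard deviation
    n                  : size of the second sample (and of the single sample test)
    h, theta           : constants of the test, with G(-h) = alpha and theta from equation (1)
    """
    n1 = len(first_sample)
    p = n1 / n
    u1 = math.sqrt(n1) * (sum(first_sample) / n1 - m0) / sigma
    lower = -math.sqrt(p) * h - theta
    upper = -math.sqrt(p) * h + theta
    if u1 < lower:
        return "reject", n1
    if u1 > upper:
        return "accept", n1
    # Undecided after the first stage: the second test uses the new observations only.
    second_sample = draw_second_sample(n)
    u2 = math.sqrt(n) * (sum(second_sample) / n - m0) / sigma
    return ("reject" if u2 < -h else "accept"), n1 + n


if __name__ == "__main__":
    # Illustrative constants: alpha = 0.05 (h about 1.6449), p = 1/2, theta about 0.6216.
    print(double_sample_test([-0.4] * 8, lambda k: [-1.0] * k,
                             m0=0.0, sigma=1.0, n=16, h=1.6449, theta=0.6216))
```

Passing the second sample through a callback reflects that those observations are taken only when the first stage is inconclusive.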

The Type I error of the double sample test is equal to
$$G(-\sqrt{p}\,h - \theta) + \alpha[1 - G(-\sqrt{p}\,h - \theta) - G(\sqrt{p}\,h - \theta)].$$
If the Type I error of the double sample test is to be equal to $\alpha$, then

(1) $\alpha^{-1}(1 - \alpha)G(-\sqrt{p}\,h - \theta) = G(\sqrt{p}\,h - \theta)$

holds, which is the equation that defines $\theta$.

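THEOREM 2.1. For each given $p$ and $\alpha$ there exists one and only one $\theta$ which satisfies equation (1).

PROOF. Set $Y = \alpha^{-1}(1 - \alpha)G(-\sqrt{p}\,h - \theta) - G(\sqrt{p}\,h - \theta)$. Then to show that $Y$ has only one positive zero, note that
$$\frac{dY}{d\theta} = \frac{1}{\sqrt{2\pi}}\left[e^{-(\sqrt{p}\,h - \theta)^2/2} - \alpha^{-1}(1 - \alpha)\,e^{-(\sqrt{p}\,h + \theta)^2/2}\right]$$
and hence that $Y$ is positive decreasing at $\theta = 0$, and negative increasing as $\theta$ approaches infinity. Since $Y$ has only one critical point, $Y$ has only one zero.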
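Since $Y$ is positive to the left of its single zero and negative to the right, $\theta$ can be located by bisection. The following sketch assumes $0 < p < 1$ and $0 < \alpha < \tfrac12$; the function names, the search interval and the tolerance are illustrative choices and not taken from the paper.

```python
import math
from statistics import NormalDist


def G(x):
    """Standard normal distribution function, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def solve_theta(p, alpha, upper=8.0, tol=1e-12):
    """Solve equation (1) for theta by bisection on the sign of Y(theta)."""
    if not (0.0 < p < 1.0 and 0.0 < alpha < 0.5):
        raise ValueError("expected 0 < p < 1 and 0 < alpha < 1/2")
    h = -NormalDist().inv_cdf(alpha)          # G(-h) = alpha
    a = math.sqrt(p) * h

    def y(theta):
        return (1.0 - alpha) / alpha * G(-a - theta) - G(a - theta)

    low, high = 0.0, upper
    if not (y(low) > 0.0 > y(high)):
        raise ValueError("search interval does not bracket the zero of Y")
    while high - low > tol:
        mid = 0.5 * (low + high)
        if y(mid) > 0.0:
            low = mid                         # zero lies to the right of mid
        else:
            high = mid                        # zero lies at or to the left of mid
    return 0.5 * (low + high)


if __name__ == "__main__":
    print(solve_theta(p=0.5, alpha=0.05))
```

Half the difference of the two boundary points in a row of Table 2.1 gives the corresponding $\theta$, so tabulated rows offer a direct check on the computed value.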
The power of the double sample test is given by
$$G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) + [G(-h + w)][1 - G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) - G(\sqrt{p}\,h - \theta - \sqrt{p}\,w)]$$
where $w = \sqrt{n}(m_0 - m)/\sigma$. Define
$$\phi(w) = [1 - G(-h + w)][G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w)] - [G(-h + w)][G(\sqrt{p}\,h - \theta - \sqrt{p}\,w)];$$
that is, $\phi(w)$ is equal to the power of the double sample test minus the power of the single sample test based on $n$ observations.

THEOREM 2.2. The function, $\phi(w)$, has the following zeros: $w = -\infty$, $w = 0$, $w = h$, $w = 2h$, and $w = +\infty$.

PROOF. Substitution in $\phi(w)$ verifies the theorem if equation (1) is kept in mind for $w = 0$ and $w = 2h$. Note that uniqueness is not claimed although this is apparently the case.
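The finite zeros of $\phi$ can be checked numerically. The sketch below takes the two first-stage boundary points $-1.7847$ and $-0.5415$ from a row of Table 2.1 for $\alpha = 0.05$ and recovers $\theta$ and $p$ from them (an assumption about how the table is to be read); since those entries are rounded to four decimals, the zeros at $w = 0$ and $w = 2h$ are reproduced only approximately. Function names are hypothetical.

```python
import math
from statistics import NormalDist


def G(x):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def power_double(w, p, h, theta):
    """Power of the double sample test at w = sqrt(n) * (m0 - m) / sigma."""
    root_p = math.sqrt(p)
    reject_first = G(-root_p * h - theta + root_p * w)
    accept_first = G(root_p * h - theta - root_p * w)
    return reject_first + G(-h + w) * (1.0 - reject_first - accept_first)


def phi(w, p, h, theta):
    """Power of the double sample test minus power of the single sample test."""
    return power_double(w, p, h, theta) - G(-h + w)


# One row of Table 2.1 for alpha = 0.05: the two first-stage boundary points.
ALPHA = 0.05
LOWER, UPPER = -1.7847, -0.5415
H = -NormalDist().inv_cdf(ALPHA)
THETA = (UPPER - LOWER) / 2.0              # half the distance between the points
P = (-(UPPER + LOWER) / (2.0 * H)) ** 2    # from sqrt(p) * h = -(LOWER + UPPER) / 2

if __name__ == "__main__":
    for w in (0.0, H, 2.0 * H):
        print(round(phi(w, P, H, THETA), 6))
```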

THEOREM 2.3. If $w = u + h$, then $\phi(u + h) = -\phi(-u + h)$.

PROOF. $\phi(u + h) = [1 - G(u)][G(-\theta + \sqrt{p}\,u)] - [G(u)][G(-\theta - \sqrt{p}\,u)]$, and since $G(-u) = 1 - G(u)$, $\phi(u + h) = [G(-u)][G(-\theta + \sqrt{p}\,u)] - [1 - G(-u)][G(-\theta - \sqrt{p}\,u)] = -\phi(-u + h)$.
It is convenient now to define
$$R(w) = G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) + G(\sqrt{p}\,h - \theta - \sqrt{p}\,w),$$
that is, $R(w)$ is the sum of the probability of rejecting $H$ and the probability of accepting $H$ at the first step of the double sample test. The expected number of observations, which is discussed below, is a function of $R(w)$.

THEOREM 2.4. The function, R(w), has a minimum with respect to w when w = 
h and R(w) is a decreasing function for w < h and an increasing function of w for 
w>h. 

Proor. This follows immediately from the derivatives of R(w). 

THEOREM 2.5. The power of the double sample test is an increasing function of $w$.

PROOF. The power is an increasing function up to $w = h$ since the power is equal to $G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) + G(-h + w)[1 - R(w)]$ and $G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w)$ and $G(-h + w)$ are increasing functions of $w$ and $R(w)$ is a nonincreasing function up to $w = h$ by Theorem 2.4. But if $w = u + h$, this means that $G(-u) + \phi(-u + h)$ is a decreasing function of $u$ for all positive $u$. Hence the power which is equal to $\phi(u + h) + G(u)$, which is equal to $1 - \phi(-u + h) - G(-u)$ by Theorem 2.3, is an increasing function for all $u$ and hence for all $w$.

THEOREM 2.6. The function, $\phi(w)$, is an increasing function of $w$ at $w = h$, provided $p \ge (2\pi)^{-1}$ and $h \ge 0.468$.

OUTLINE OF PROOF. $(d\phi/dw)_h = (2\pi)^{-1/2}[\sqrt{p}\,e^{-\theta^2/2} - 2G(-\theta)]$. To show that $(d\phi/dw)_h$ is positive for $p \ge (2\pi)^{-1}$ and $h \ge 0.468$ show that it is positive for $p = (2\pi)^{-1}$, positive decreasing as $p \to 1$, and has only one critical point for $(2\pi)^{-1} < p < 1$. It is easy to show that $(d\phi/dw)_h \to 0$ through positive values as $p \to 1$ by consideration of the derivatives of $\sqrt{p}\,e^{-\theta^2/2}$ and $2G(-\theta)$ as $p \to 1$. Next it can be shown that there exists one and only one value of $z = \theta\sqrt{p}$ which corresponds to any critical points of $(d\phi/dw)_h$. Then for all $p \ge (2\pi)^{-1}$ there exists one and only one $p$ which will give any particular value of $z = \theta\sqrt{p}$. Next it can be shown that for $p = (2\pi)^{-1}$ and $\theta \ge 1.572$, $(d\phi/dw)_h$ is positive, and then for all $h \ge 0.468$ and $p = (2\pi)^{-1}$, $\theta \ge 1.572$.

From the foregoing theorems it appears that if $h \ge 0.468$ and $p \ge (2\pi)^{-1}$, the power of the double sample test is less than that of the corresponding single sample test based on $n$ observations, $G(-h + w)$, for $0 < w < h$ and $w > 2h$ and is greater for $w < 0$ and $h < w < 2h$. Since uniqueness of the zeros has not been shown, this is, of course, conjecture, but Example 2.1 below shows that this is true in case $p = \tfrac{1}{2}$ and $\alpha = 0.05$. This would make the double sample test more desirable than the single sample test based on $n$ observations if, as is frequently the case, it is more important to reject less often for small values of $w$ or more often for large values of $w$. In the special examples that have been computed $\phi(w)$ has had consistently small values (less than 0.01 for $w < 0$ and $w > 2h$), that is, the discrepancy between the powers of the single and double sample tests has been negligible in the tails.

The expected number of observations for the double sample test is given by
$E_w(N) = n[1 + p - R(w)]$. For a fixed $p$, $E_w(N)$ is a maximum when $w = h$
by Theorem 2.4. When the hypothesis $H$ is true, the expected number of
observations is $E_0(N) = n[1 + p - \alpha^{-1} G(-\sqrt{p}\,h - \theta)]$. The minimum of $E_0(N)$ with
respect to $p$ for $\alpha = 0.01$, $\alpha = 0.05$ and $\alpha = 0.10$ is attained at approximately
$p = 0.505$, $p = 0.524$ and $p = 0.5003$, respectively.
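
The expected sample size formula can be evaluated directly. The following sketch is an illustration under stated assumptions, not the author's computation: $G$ is taken to be the standard normal cumulative distribution function, and $R(w)$ is assumed to be the probability that the first sample alone leads to a decision, $G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) + G(\sqrt{p}\,h - \theta - \sqrt{p}\,w)$. The values $h = 1.6449$, $p = 0.5$, $\theta = 0.6216$ are those quoted for $\alpha = 0.05$ in Example 2.1.

```python
from math import erfc, sqrt

def G(x):
    """Standard normal cumulative distribution function (assumed meaning of G)."""
    return 0.5 * erfc(-x / sqrt(2.0))

def stop_probability(w, p, h, theta):
    """Assumed form of R(w): probability of rejecting or accepting at the first stage."""
    reject_first = G(-sqrt(p) * h - theta + sqrt(p) * w)
    accept_first = G(sqrt(p) * h - theta - sqrt(p) * w)
    return reject_first + accept_first

def expected_fraction(w, p, h, theta):
    """E_w(N) / n = 1 + p - R(w) for the Section 2 procedure."""
    return 1.0 + p - stop_probability(w, p, h, theta)

h, p, theta = 1.6449, 0.5, 0.6216
for w in (0.0, h, 2.0 * h):
    print(round(w, 4), round(expected_fraction(w, p, h, theta), 4))
```

With these inputs the sketch gives roughly $0.757n$ at $w = 0$ and $w = 2h$ and roughly $0.966n$ at $w = h$, in line with the values $0.7569n$ and $0.9658n$ used in Example 2.1.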


TABLE 2.1

For $\alpha = 0.05$

   $p$     $-\sqrt{p}\,h - \theta$    $-\sqrt{p}\,h + \theta$     $\theta$        $k$
  0.20        -2.3006          +0.8293       1.5650    0.009118
  0.25        -2.1331          +0.4882       1.3107    0.008988
  0.30        -2.0174          +0.2155       1.1165    0.008566
  0.40        -1.8707          -0.2099       0.8304    0.007402
  0.50        -1.7847          -0.5415       0.6216    0.006210
  0.60        -1.7308          -0.8175       0.4566    0.005158
  0.70        -1.6956          -1.0569       0.3193    0.004277
  0.75        -1.6826          -1.1664       0.2581    0.003899
  0.80        -1.6720          -1.2705       0.2007    0.003557
  0.90        -1.6559          -1.4651       0.0954    0.002972

[The $\tau$ column for $\alpha = 0.05$ is illegible in this copy.]

For $\alpha = 0.01$

   $p$     $-\sqrt{p}\,h - \theta$    $-\sqrt{p}\,h + \theta$     $\theta$        $k$
  0.20        -2.8387          +0.7580       1.7984    0.0013253
  0.25        -2.6837          +0.3573       1.5205    0.0012304
  0.30        -2.5816          +0.0331       1.3073    0.0010936
  0.40        -2.4616          -0.4809       ...       0.0008073
  0.50        -2.3991          -0.8909       ...       0.0005710
  0.60        -2.3651          -1.2388       ...       0.0003983
  0.70        -2.3461          -1.5466       ...       0.0002767
  0.75        -2.3400          -1.6891       ...       0.0002314
  0.80        -2.3356          -1.8258       ...       0.0001940
  0.90        -2.3296          -2.0842       ...       0.0001379

[Entries shown as ... are illegible in this copy.]



This test is obviously not as efficient as it could be since when it is necessary to 
take the second sample no use of the first sample is made in the second test. 
Even so, Example 2.1 below shows that for p equal to one-half, the expected 
number of observations is considerably less than the number required for the 
single sample test based on n observations and the power has some desirable 
properties over the power of the single sample test. In sections 3 and 4 the above 
procedure is modified so that the test at the second stage makes use of the first 
set of observations. 


Table 2.1 is a tabulation of the rejection and acceptance points for various
values of $p$ and $\alpha = 0.05$ and $0.01$, together with a tabulation of $\theta$. The quantity
$k$ is defined in Section 4. For $\alpha = 0.05$, $k$ has a maximum when $p = 0.2090$,
$\theta = 1.5129$ and $k = 0.009126$. The quantity $\tau$ is defined in Section 3.
EXAMPLE 2.1. If $\alpha = 0.05$ and $p = \tfrac12$, then $\theta = 0.6216$. For purposes of
comparison the powers of various single sample tests are listed in Table 2.2 beside
the power of the double sample test. The column headed Power is the power of
the double sample test as outlined in this section. $G_1 = G(-h + w)$ is the power
of a single sample test based on $n$ observations. $G_2 = G(-h + 0.9828w)$ is the
power of a single sample test based on $0.9658n$ observations, that is, on the
maximum expected number of observations of the double sample test. $G_3 =
G(-h + 0.8700w)$ is the power of a single sample test based on $0.7569n$
observations, that is, on the expected number of observations of the double sample


TABLE 2.2

   $w$     $E_w(N)$     Power     $G_1$      $G_2$      $G_3$      $G_4$
   -2     0.5245n    0.0007   0.0001   0.0002   0.0004   0.0010
   -1     0.5995n    0.0068   0.0041   0.0043   0.0060   0.0078
    0     0.7569n    0.0500   0.0500   0.0500   0.0500   0.0500
   0.5    0.8493n    0.1203   0.1261   0.1244   0.1132   0.1182
    1     0.9252n    0.2509   0.2595   0.2540   0.2192   0.2473
   $h$    0.9658n    0.5000   0.5000   0.4887   0.4154   0.4887
    2     0.9531n    0.6449   0.6387   0.6258   0.5379   0.6208
  $2h$    0.7569n    0.9500   0.9500   0.9439   0.8882   0.8882
    4     0.6372n    0.9876   0.9907   0.9889   0.9668   0.9392
    5     0.5386n    0.9986   0.9996   0.9995   0.9966   0.9785


test when the hypothesis $H$ is true. $G_4 = G[-h + \delta(w)]$ gives the values of the
various single sample power curves based on the expected number of
observations for the double sample test for the particular alternative at hand, that is,
$\delta(w) = w\sqrt{E_w(N)/n}$, where $E_w(N)$ is the expected number of observations for
the double sample test. Note that the power of the double sample test is
everywhere better than the power $G[-h + \delta(w)]$. Hence if power is lost for any
alternative where $w$ is positive, it is not lost in greater measure than is caused by taking
fewer observations on the average.
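
The comparison curves of Example 2.1 can be recomputed for any single alternative. The sketch below is illustrative and rests on assumptions that are not restated in this section: $G$ is the standard normal cumulative distribution function, $R(w)$ is the first-stage stopping probability $G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) + G(\sqrt{p}\,h - \theta - \sqrt{p}\,w)$, and the power of the Section 2 test is the first-stage rejection probability plus $[1 - R(w)]\,G(-h + w)$. The function name `table_row` is hypothetical.

```python
from math import erfc, sqrt

def G(x):
    """Standard normal cumulative distribution function (assumed meaning of G)."""
    return 0.5 * erfc(-x / sqrt(2.0))

def table_row(w, p=0.5, h=1.6449, theta=0.6216):
    """One row of the comparison: E_w(N)/n, double sample power, and G_1..G_4."""
    root_p = sqrt(p)
    reject_first = G(-root_p * h - theta + root_p * w)
    accept_first = G(root_p * h - theta - root_p * w)
    stop = reject_first + accept_first                    # assumed R(w)
    e_w = 1.0 + p - stop                                  # E_w(N) / n
    e_0 = 1.0 + p - (G(-root_p * h - theta) + G(root_p * h - theta))
    e_max = 1.0 + p - 2.0 * G(-theta)                     # value of E_w(N)/n at w = h
    power = reject_first + (1.0 - stop) * G(-h + w)       # assumed Section 2 power
    return {
        "E": e_w,
        "power": power,
        "G1": G(-h + w),
        "G2": G(-h + sqrt(e_max) * w),
        "G3": G(-h + sqrt(e_0) * w),
        "G4": G(-h + sqrt(e_w) * w),
    }

row = table_row(0.5)
print({name: round(value, 4) for name, value in row.items()})
```

For $w = 0.5$ this reproduces, to about three decimals, the entries $0.8493n$, $0.1203$, $0.1261$, $0.1244$, $0.1132$ and $0.1182$ of Table 2.2.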

The choice of the rejection and acceptance intervals was an intuitive one in 
the first place, and although investigation into the optimum such choice indicates 
that the one that was made is probably the best that can be made from the stand- 
point of minimum expected number of observations balanced against a uniformly 
powerful test, no conclusive results have been obtained in this direction. 


3. Second ($\tau$) test procedure. The first part of the tests in this and the
following section is the same as the test in Section 2. That is, $n_1$ observations are taken
($n_1 < n$) and $\bar{x}_1 = \sum_{i=1}^{n_1} x_i/n_1$, $p = n_1/n$, and $u_1 = \sqrt{n_1}\,(\bar{x}_1 - m)/\sigma$ are
computed. If $u_1 < -\sqrt{p}\,h - \theta$ reject $H$; if $u_1 > -\sqrt{p}\,h + \theta$ accept $H$; and if
$-\sqrt{p}\,h - \theta \le u_1 \le -\sqrt{p}\,h + \theta$ take $n_2$ additional observations, where
$n_2 = n - n_1$.

For $n$ observations in all, $\bar{x} = \sum_{i=1}^{n} x_i/n$ and $u = \sqrt{n}\,(\bar{x} - m)/\sigma$ are
computed. If $u \le \tau$ reject $H$, and if $u > \tau$ accept $H$, where $\tau$ is determined
from the equation

$$\frac{1}{2\pi\sqrt{1-p}} \int_{-\infty}^{\tau} \int_{-\sqrt{p}\,h-\theta}^{-\sqrt{p}\,h+\theta} \exp\left(-\frac{u^2 - 2\sqrt{p}\,uv + v^2}{2(1-p)}\right) du\,dv = \alpha\,[1 - G(-\sqrt{p}\,h - \theta) - G(\sqrt{p}\,h - \theta)],$$

since the joint distribution of the random variables $U_1$ and $U$ is bivariate
normal with zero means and unit variances and correlation equal to $\sqrt{p}$. A similar
test procedure is given in [1]. The quantity $\tau$ is tabulated for $\alpha = 0.05$ in Table
2.1 and was obtained by interpolation in the bivariate normal table given in
[6]. For $\alpha = 0.01$ and other significance levels, however, the tables given in
[6] are not extensive enough to obtain $\tau$ for most values of $p$. Section 4 gives an
alternative procedure which can be used in this case.
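
Where bivariate normal tables are not at hand, the defining equation for $\tau$ can also be solved numerically. The sketch below is only one way to do this and is not the interpolation method described in the text. It assumes that $G$ is the standard normal cumulative distribution function, conditions on the overall statistic $U = v$ (so that $U_1$ is normal with mean $\sqrt{p}\,v$ and variance $1 - p$), integrates with Simpson's rule, and finds $\tau$ by bisection. All function names are hypothetical.

```python
from math import erfc, exp, pi, sqrt

def G(x):
    """Standard normal cumulative distribution function (assumed meaning of G)."""
    return 0.5 * erfc(-x / sqrt(2.0))

def g(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)

def continue_then_reject(tau, lo, hi, p, steps=2000, floor=-9.0):
    """Pr(lo <= U1 <= hi and U <= tau) when corr(U1, U) = sqrt(p).

    Given U = v, U1 is normal with mean sqrt(p)*v and variance 1 - p, so the
    inner integral is a difference of two normal cdf values; the outer
    integral over v uses Simpson's rule on [floor, tau].
    """
    if tau <= floor:
        return 0.0
    r, s = sqrt(p), sqrt(1.0 - p)

    def integrand(v):
        return g(v) * (G((hi - r * v) / s) - G((lo - r * v) / s))

    width = (tau - floor) / steps
    total = integrand(floor) + integrand(tau)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(floor + i * width)
    return total * width / 3.0

def solve_tau(alpha, p, h, theta):
    """Bisection for tau in the size equation of Section 3."""
    lo, hi = -sqrt(p) * h - theta, -sqrt(p) * h + theta
    target = alpha * (1.0 - G(lo) - G(-hi))
    left, right = -8.0, 8.0
    for _ in range(60):
        mid = 0.5 * (left + right)
        if continue_then_reject(mid, lo, hi, p) < target:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)

tau = solve_tau(alpha=0.05, p=0.5, h=1.6449, theta=0.6216)
print(round(tau, 3))
```

For $\alpha = 0.05$ and $p = \tfrac12$ the solution should fall close to the value $\tau = -1.96$ quoted in Example 4.1.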
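The power of the test given in this section is equal to

$$G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) + \frac{1}{2\pi\sqrt{1-p}} \int_{-\infty}^{\tau+w} \int_{-\sqrt{p}\,h-\theta+\sqrt{p}\,w}^{-\sqrt{p}\,h+\theta+\sqrt{p}\,w} \exp\left(-\frac{u^2 - 2\sqrt{p}\,uv + v^2}{2(1-p)}\right) du\,dv,$$

and may be obtained in many cases from the bivariate normal table given in [6].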

The expected number of observations for this double sample test is given by
$E_w(N) = n[1 - (1 - p)R(w)]$. For a fixed $p$, $E_w(N)$ is a maximum when $w = h$
by Theorem 2.4. When the hypothesis $H$ is true,

$$E_0(N) = n[1 - \alpha^{-1}(1 - p)\,G(-\sqrt{p}\,h - \theta)].$$

The minimum of $E_0(N)$ with respect to $p$ for $\alpha = 0.01$, $0.05$, and $0.10$ is
approximately for $p = 0.443$, $p = 0.457$, and $p = 0.46$, respectively. The power and
expected number of observations for this test for $\alpha = 0.05$ and $p = \tfrac12$ are
tabulated in Table 4.1.


4. Third ($J$) test procedure. This test is primarily an alternative procedure in
case $\tau$ cannot be obtained for the procedure in Section 3. At the first stage go
through the same steps outlined in Sections 2 and 3 and at the second stage take
$n_2 = n - n_1$ additional observations and for $n$ observations in all compute
$\bar{x}_2 = \sum_{i=n_1+1}^{n} x_i/n_2$, $u_2 = \sqrt{n_2}\,(\bar{x}_2 - m)/\sigma$, $j_1 = G(u_1)$, $j_2 = G(u_2)$,
$a = G(-\sqrt{p}\,h - \theta)$, $b = G(-\sqrt{p}\,h + \theta)$, and $q = j_1 j_2$, where the random variable
$Q$ has the distribution given by Theorem 4.1 below. Define $k$ by $\Pr(Q \le k) = \alpha$
and reject $H$ if $q \le k$ and accept $H$ if $q > k$. If the size of the double sample
test is to be $\alpha$ as in Sections 2 and 3, Theorem 2.1 applies once more.

THEOREM 4.1. The cumulative distribution function for the random variable
$Q$, when $a \le J_1 \le b$, is given by

$$\Pr(Q \le q) = \begin{cases} \dfrac{q(\log b - \log a)}{b - a} & \text{for } 0 \le q \le a, \\[1ex] \dfrac{q(1 + \log b - \log q) - a}{b - a} & \text{for } a \le q \le b. \end{cases}$$

PROOF. The random variable $J_2$ has the rectangular distribution over the
interval 0 to 1 and $J_1$, given $a \le J_1 \le b$, has the rectangular distribution over the
interval $a$ to $b$. If $Q = J_1 J_2$, the joint distribution of $Q$ and $J_1$ is given by
$1/[(b - a)J_1]$ where the region of definition is the trapezoid bounded by $Q = 0$,
$Q = J_1$, $J_1 = a$, and $J_1 = b$. Integrate out $J_1$ and then integrate again to obtain the
cumulative distribution function (2).
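
The constant $k$ can be obtained from the distribution in Theorem 4.1 by solving $\Pr(Q \le k) = \alpha$. The sketch below is illustrative: it assumes that $G$ is the standard normal cumulative distribution function, extends the distribution function by 0 below $q = 0$ and 1 above $q = b$, and uses plain bisection. The first-stage limits $-1.7847$ and $-0.5415$ are the $\alpha = 0.05$, $p = \tfrac12$ entries of Table 2.1; the function names are hypothetical.

```python
from math import erfc, log, sqrt

def G(x):
    """Standard normal cumulative distribution function (assumed meaning of G)."""
    return 0.5 * erfc(-x / sqrt(2.0))

def cdf_Q(q, a, b):
    """Pr(Q <= q) from Theorem 4.1, extended by 0 for q <= 0 and 1 for q > b."""
    if q <= 0.0:
        return 0.0
    if q <= a:
        return q * (log(b) - log(a)) / (b - a)
    if q <= b:
        return (q * (1.0 + log(b) - log(q)) - a) / (b - a)
    return 1.0

def solve_k(alpha, a, b):
    """Bisection for k with Pr(Q <= k) = alpha; the cdf is nondecreasing in q."""
    lo, hi = 0.0, b
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if cdf_Q(mid, a, b) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a = G(-1.7847)   # first-stage rejection limit on the probability scale
b = G(-0.5415)   # first-stage acceptance limit on the probability scale
k = solve_k(0.05, a, b)
print(round(k, 6))
```

With these inputs $k$ comes out near $0.00621$, which agrees with the value $k = 0.006210$ listed for $p = \tfrac12$ and $\alpha = 0.05$.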

The power of this double sample test is given by

$$G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) + q^*\,[1 - G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w) - G(\sqrt{p}\,h - \theta - \sqrt{p}\,w)]$$

where $q^* = \Pr(Q \le k$ given that the mean is $m$ and $a \le J_1 \le b)$. An exact
formula for $q^*$ can be found using the same process as that used in Theorem 4.1,
but the result is complicated and is not practical to use in computations. A
lower bound for $q^*$ can be found as follows.

Let $J_3 = G(U_1 + \sqrt{p}\,w)$, $J_4 = G(U_2 + \sqrt{1 - p}\,w)$,

$$a' = G(-\sqrt{p}\,h - \theta + \sqrt{p}\,w), \qquad b' = G(-\sqrt{p}\,h + \theta + \sqrt{p}\,w),$$

and $Q' = J_3 J_4$. When the mean is $m$, $Q'$ has the same distribution as $Q$, given
in Theorem 4.1, but with $a$ and $b$ replaced by $a'$ and $b'$, respectively. Determine
the pair of values of $(U_1, U_2)$ that minimize the product $J_3 J_4$ such that $J_1 J_2 = k$
and $-\sqrt{p}\,h - \theta \le U_1 \le -\sqrt{p}\,h + \theta$ (not necessarily the same pair for each
$w$). Denote the minimum value of $J_3 J_4$ by $k_w$. Then $q^* \ge \Pr(Q' \le k_w)$, since
the hypothesis is always rejected if $Q' < k_w$.

The expected number of observations for this test is the same as for the test
in Section 3 with the same maximum and minimum values.

EXAMPLE 4.1. If $\alpha = 0.05$ and $p = \tfrac12$, then $\theta = 0.6216$, $\tau = -1.96$, and $k =
0.006210$. For purposes of comparison in this case the powers of various single
sample tests and the expected number of observations for the $\tau$ and $J$ double
sample tests are listed in Table 4.1. The column headed L. B. Power $J$-Test
refers to the lower bound of the power of the double sample ($J$) test as outlined
in this section. $G_1 = G(-h + w)$ is the power of a single sample test based on
$n$ observations. $G_2 = G(-h + 0.8561w)$ is the power of a single sample test
based on $0.7329n$ observations, that is, on the maximum expected number of
observations of the double sample ($J$ and $\tau$) tests. $G_3 = G(-h + 0.7927w)$ is





456 DONALD B. OWEN 


the power of a single sample test based on $0.6284n$ observations, that is, on the
expected number of observations of the double sample tests when the hypothesis
$H$ is true. $G_4 = G[-h + \gamma(w)]$ gives the values of the various single sample
power curves based on the expected number of observations for the $J$ and $\tau$
double sample tests for the particular alternative at hand, that is, $\gamma(w) =
w\sqrt{E_w(N)/n}$, and $E_w(N)$ is the expected number of observations for the $J$ and $\tau$
double sample tests. Note that the power of both of the double sample tests is
everywhere better than $G[-h + \gamma(w)]$. Hence, if power is lost using either of
the double sample tests for any alternative where $w$ is positive, it is not lost in
greater measure than is caused by taking fewer observations on the average.


TABLE 4.1

   $w$     $E_w(N)$      $G_1$       $G_3$
   -1     0.5498n    0.0041    0.0074
    0     0.6284n    0.0500    0.0500
   0.5    0.6746n    0.1261    0.1059
    1     0.7126n    0.2595    0.1971
   $h$    0.7329n    0.5000    0.3666
    2     0.7265n    0.6387    0.4763
    3     0.6556n    0.9123    0.7683
    4     0.5686n    0.9907    0.9365
    5     0.5193n    0.9996    0.9898

[The columns for the power of the $\tau$ test, the lower bound of the power of the $J$ test, $G_2$ and $G_4$ are illegible in this copy.]



5. Test of the Student hypothesis. A test of the hypothesis considered in
Section 2 when the standard deviation is unknown can be constructed by simply
making the probabilities of accepting and rejecting equal to the corresponding
probabilities given in Section 2. That is, let

$$S_n(x) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\;\Gamma\left(\frac{n}{2}\right)} \int_{-\infty}^{x} \left(1 + \frac{t^2}{n}\right)^{-\frac12(n+1)} dt,$$

and let $\lambda$ and $\eta$ be defined by $S_{n_1-1}(-\lambda) = G(-\sqrt{p}\,h - \theta)$, $S_{n_1-1}(-\eta) =
G(-\sqrt{p}\,h + \theta)$. Replace $\sigma^2$ in the test procedure of Section 2 by the corresponding
unbiased estimates $s_1^2 = \sum_{i=1}^{n_1} (x_i - \bar{x}_1)^2/(n_1 - 1)$ and $s_2^2 = \sum_{i=n_1+1}^{n_1+n} (x_i - \bar{x}_2)^2/(n - 1)$.
For $n_1$ observations reject $H$ if $u_1 < -\lambda$ and accept $H$ if $u_1 > -\eta$. Take
$n$ additional observations if $-\lambda \le u_1 \le -\eta$. For $n_1 + n$ observations make
the usual test of size $\alpha$ using the last $n$ observations only. For given $n_1$, $\alpha$ and
$p$, $\lambda$ and $\eta$ can be obtained from [4]. The power of the test and the expected
number of observations can be easily computed from the tables given in [5].

Tests similar to those given in Sections 3 and 4 can obviously be constructed
for the Student hypothesis. Extension to any other hypothesis can easily be
effected by assigning the probability of rejecting and the probability of
accepting equal to the corresponding probabilities obtained using the normal test, as
was done for the Student hypothesis above.

The author wishes to thank Professor Douglas G. Chapman of the Laboratory 


of Statistical Research at the University of Washington for his helpful advice 
during the preparation of this paper.

SOME TESTS BASED ON ORDERED OBSERVATIONS FROM TWO
EXPONENTIAL POPULATIONS


By Benjamin Epstein and Chia Kuei Tsao
Wayne University
1. Introduction. Let $x_{11} \le x_{12} \le \cdots \le x_{1n_1}$ and $x_{21} \le x_{22} \le \cdots \le x_{2n_2}$ be
two random samples ($S_{n_1}$ and $S_{n_2}$) from populations having p.d.f.'s $f(x; A_1, \theta_1)$
and $f(x; A_2, \theta_2)$ respectively, where

(1)   $f(x; A, \theta) = \dfrac{1}{\theta} \exp[-(x - A)/\theta]$.

Let $S_{r_1}$ and $S_{r_2}$ be the sets of the first $r_1$ and $r_2$ smallest observations of $S_{n_1}$ and
$S_{n_2}$ respectively. Then the p.d.f.'s of $S_{r_1}$ and $S_{r_2}$ are given, say, by

$g(x_{11}, \cdots, x_{1r_1}; A_1, \theta_1)$ and $g(x_{21}, \cdots, x_{2r_2}; A_2, \theta_2)$,

where

(2)   $g(x_1, x_2, \cdots, x_r; A, \theta) = \dfrac{n!}{(n - r)!\,\theta^r} \exp\left\{-\dfrac{1}{\theta}\left[\sum_{i=1}^{r}(x_i - A) + (n - r)(x_r - A)\right]\right\}$.

The likelihood ratio tests based on the complete sets, S,, and S,, are special 
cases of those obtained by Sukhatme [2], [3]. It can be shown that similar likeli- 
hood ratio tests based on S,, and S,, may be obtained by following Sukhatme’s 
procedure [2]. In this paper these likelihood ratio tests are reduced to equivalent 
tests which are expressed in terms of the well known chi square and Snedecor’s 
F distributions. Furthermore, some of the tests obtained in this paper can be 
extended to k-sample tests. 

Since percentage points for $\chi^2$ and $F$ distributions are tabled, tests involving
these random variables are useful in applications. We remark that the likelihood 
ratio test for the hypothesis H; (see Section 3) has been obtained by Paulson [1]. 

The results of this paper can be used in the field of life testing. A characteristic 
feature of such tests is that observations become available in order of magnitude. 




The assumption of an exponential distribution of life is a reasonable one to make
in some applications (e.g., electron tube life). The parameter $A$ can be interpreted
as minimum life (also called sensitivity limit in fatigue failure problems) and the
parameter $\theta$ is the mean life measured from $A$ as a starting point. From the life
test point of view one has a sample of size $n_1$ from population 1 and a sample of
size $n_2$ from population 2, the two populations one wishes to compare. Procedures
are given for testing various hypotheses regarding the $A_i$ and $\theta_i$ ($i = 1, 2$) based
on information which has been truncated in the sense that one has only the first
$r_1$ failure times from the sample of size $n_1$ (population 1) and the first $r_2$ failure
times from the sample of size $n_2$ (population 2). $r_1$ and $r_2$ (as well as $n_1$ and $n_2$) are
assumed to be preassigned.

Received 11/24/52, revised 3/14/53.


2. Preliminary lemmas. We give several lemmas which were used to obtain 
the distributions of the reduced statistics. Lemmas 1 and 2 can be proved by 
the use of characteristic functions and their proofs are omitted. Proofs of Lemmas 
3 to 6 are given. 

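In Lemmas 1 and 2 below, we let $x_1 \le x_2 \le \cdots \le x_r \le \cdots \le x_n$ be a random
sample from a population having p.d.f. (1) and we define statistics $u$, $v$, and $h$ as

(3)   $u = \dfrac{2}{\theta}\left[\sum_{i=1}^{r}(x_i - A) + (n - r)(x_r - A)\right]$,

(4)   $v = \dfrac{2}{\theta}\left[\sum_{i=1}^{r}(x_i - x_1) + (n - r)(x_r - x_1)\right]$,

(5)   $h = \dfrac{2n}{\theta}(x_1 - A)$.

LEMMA 1. $u$ is distributed as $\chi^2(2r)$.

LEMMA 2. $v$ and $h$ are independently distributed as $\chi^2(2r - 2)$ and $\chi^2(2)$
respectively.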
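
Lemmas 1 and 2 can be illustrated by simulation. The sketch below is not part of the proofs: it draws censored samples from the exponential p.d.f. (1), computes the three statistics in the forms $u = (2/\theta)[\sum (x_i - A) + (n - r)(x_r - A)]$, $v = (2/\theta)[\sum (x_i - x_1) + (n - r)(x_r - x_1)]$ and $h = (2n/\theta)(x_1 - A)$ (the form of $h$ is taken here as an assumption), and compares their sample means with the chi-square means $2r$, $2r - 2$ and $2$. The parameter values, seed and function names are arbitrary choices.

```python
import random

def lemma_statistics(sample, n, r, A, theta):
    """u, v, h computed from the r smallest of n observations."""
    x = sorted(sample)[:r]
    total_about_A = sum(xi - A for xi in x) + (n - r) * (x[-1] - A)
    total_about_min = sum(xi - x[0] for xi in x) + (n - r) * (x[-1] - x[0])
    u = 2.0 * total_about_A / theta
    v = 2.0 * total_about_min / theta
    h = 2.0 * n * (x[0] - A) / theta
    return u, v, h

def simulate(trials=20000, n=8, r=5, A=1.0, theta=2.0, seed=12345):
    """Average u, v, h over repeated samples of size n censored at the r-th failure."""
    rng = random.Random(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(trials):
        # expovariate takes the rate 1/theta, so each draw has mean theta.
        sample = [A + rng.expovariate(1.0 / theta) for _ in range(n)]
        for index, value in enumerate(lemma_statistics(sample, n, r, A, theta)):
            sums[index] += value
    return [total / trials for total in sums]

mean_u, mean_v, mean_h = simulate()
print(mean_u, mean_v, mean_h)  # compare with 2r = 10, 2r - 2 = 8, and 2
```

Note that $u = v + h$ holds identically for these forms, which is consistent with a $\chi^2(2r)$ variable splitting into independent $\chi^2(2r - 2)$ and $\chi^2(2)$ parts.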

Lemmas 3 to 6 deal with the case of two samples. The statistics $u_1$, $v_1$ and
$u_2$, $v_2$ are defined as in (3) and (4). Three additional variables $w_1$, $w_2$, and $w$
are defined in (6), (7), and (8).

(6)   $w_1 = \dfrac{2n_1}{\theta_1}(x_{11} - x_{21})$,   $x_{11} > x_{21}$,

(7)   $w_2 = \dfrac{2n_2}{\theta_2}(x_{21} - x_{11})$,   $x_{21} > x_{11}$,

(8)   $w = w_1$ when $x_{11} > x_{21}$, and $w = w_2$ when $x_{21} > x_{11}$.
LEMMA 3. If $A_1 = A_2$, then

(9)   $\Pr(x_{11} > x_{21}) = \dfrac{n_2/\theta_2}{n_1/\theta_1 + n_2/\theta_2}$

and

(10)   $\Pr(x_{21} > x_{11}) = \dfrac{n_1/\theta_1}{n_1/\theta_1 + n_2/\theta_2}$.

PROOF.

$$\Pr(x_{11} > x_{21}) = \int_{A_2}^{\infty}\int_{x_{21}}^{\infty} \frac{n_1 n_2}{\theta_1 \theta_2}\, e^{-(n_1/\theta_1)(x_{11} - A_1) - (n_2/\theta_2)(x_{21} - A_2)}\, dx_{11}\, dx_{21} = \frac{n_2/\theta_2}{n_1/\theta_1 + n_2/\theta_2}.$$

Hence,

$$\Pr(x_{21} > x_{11}) = 1 - \Pr(x_{11} > x_{21}) = \frac{n_1/\theta_1}{n_1/\theta_1 + n_2/\theta_2}.$$
LEMMA 4. If $A_1 = A_2$, then both $w_1$ (given that $x_{11} > x_{21}$) and $w_2$ (given that
$x_{21} > x_{11}$) are distributed as $\chi^2(2)$.

PROOF. Since $A_1 = A_2$, $w_1$ can be written as

$$w_1 = \frac{2n_1}{\theta_1}\,[(x_{11} - A_1) - (x_{21} - A_2)].$$

Consequently,

$$x_{11} - A_1 = \frac{\theta_1}{2n_1}\, w_1 + (x_{21} - A_2).$$

Let $x_{11} - A_1 = y_1$ and $x_{21} - A_2 = y_2$, then the condition that $x_{11} > x_{21}$ is
equivalent to $y_1 > y_2$. Since the joint distribution of $y_1$ and $y_2$ is, say

(11)   $f(y_1, y_2) = \dfrac{n_1 n_2}{\theta_1 \theta_2}\, e^{-(n_1/\theta_1) y_1 - (n_2/\theta_2) y_2}$,   $y_1 > 0$, $y_2 > 0$,

we have

(12)   $\Pr(w_1 \le w_0,\ y_1 > y_2) = \dfrac{n_2/\theta_2}{n_1/\theta_1 + n_2/\theta_2}\,[1 - e^{-w_0/2}]$.

According to Lemma 3

(13)   $\Pr(y_1 > y_2) = \Pr(x_{11} > x_{21}) = \dfrac{n_2/\theta_2}{n_1/\theta_1 + n_2/\theta_2}$.

Therefore,

(14)   $\Pr(w_1 \le w_0 \mid y_1 > y_2) = 1 - e^{-w_0/2}$,

which is the cumulative form of the $\chi^2$ distribution with 2 d.f. This completes the
proof of the first assertion in Lemma 4. The proof for the second assertion is
similar.

LEMMA 5. If $A_1 = A_2$, then $w$ is distributed as $\chi^2(2)$.

PROOF. Since

(15)   $\Pr(w \le w_0) = \Pr(w_1 \le w_0,\ y_1 > y_2) + \Pr(w_2 \le w_0,\ y_1 < y_2)$,

then by (12)

(16)   $\Pr(w \le w_0) = 1 - e^{-w_0/2}$,

which proves Lemma 5.
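
Lemmas 3 and 5 lend themselves to a quick simulation check. The sketch below is illustrative only: it draws the smallest observation of each sample directly, with a common minimum life $A$, forms $w$ as $(2n_1/\theta_1)(x_{11} - x_{21})$ or $(2n_2/\theta_2)(x_{21} - x_{11})$ according to which first failure is larger, and compares the results with the probability in Lemma 3 and with the $\chi^2(2)$ distribution function $1 - e^{-w_0/2}$. Sample sizes, scale parameters and the seed are arbitrary choices, and the function name is hypothetical.

```python
import random
from math import exp

def simulate_w(trials=20000, n1=6, theta1=1.5, n2=9, theta2=3.0, A=0.5, seed=2024):
    """Return the share of trials with x11 > x21 and the simulated values of w."""
    rng = random.Random(seed)
    first_larger = 0
    w_values = []
    for _ in range(trials):
        # Smallest observation of each sample; expovariate takes the rate 1/theta.
        x11 = A + min(rng.expovariate(1.0 / theta1) for _ in range(n1))
        x21 = A + min(rng.expovariate(1.0 / theta2) for _ in range(n2))
        if x11 > x21:
            first_larger += 1
            w_values.append(2.0 * n1 * (x11 - x21) / theta1)
        else:
            w_values.append(2.0 * n2 * (x21 - x11) / theta2)
    return first_larger / trials, w_values

share, w_values = simulate_w()
predicted_share = (9 / 3.0) / (6 / 1.5 + 9 / 3.0)        # Lemma 3 with the defaults
share_below_2 = sum(1 for w in w_values if w <= 2.0) / len(w_values)
mean_w = sum(w_values) / len(w_values)
print(share, predicted_share)
print(share_below_2, 1.0 - exp(-1.0))                     # chi-square(2) cdf at 2
print(mean_w)                                             # chi-square(2) has mean 2
```

The simulated proportion should sit near $3/7$ and the simulated distribution of $w$ near $\chi^2(2)$, whatever the two sample sizes and scale parameters are.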

LEMMA 6. If $A_1 = A_2$, then (a) $v_1$, $v_2$, and $w_1$ (given that $x_{11} > x_{21}$), or (b) $v_1$,
$v_2$, and $w_2$ (given that $x_{11} < x_{21}$), or (c) $v_1$, $v_2$, and $w$ are independently distributed as
$\chi^2(2r_1 - 2)$, $\chi^2(2r_2 - 2)$ and $\chi^2(2)$ respectively.

PROOF. By Lemma 2, $v_1$ and $v_2$ are each independent of both $x_{11}$ and $x_{21}$ and the
results follow using Lemma 5.


3. Likelihood ratio tests and equivalent reduced tests. The various hypotheses 
and their associated likelihood ratio and equivalent reduced tests are listed below 
in Sections A, B, and C. One of the derivations will be given in Section D. 
Some properties of the tests are given in Section E. 

A. Statement of hypotheses.
a) $H_1$: To test $\theta_1 = \theta_2$
(assuming $A_1$ and $A_2$ are known).
b) $H_2$: To test $\theta_1 = \theta_2$
(assuming $A_1 = A_2$, but that the common value is unknown).
c) $H_3$: To test $\theta_1 = \theta_2$.
d) $H_4$: To test $A_1 = A_2$
(assuming $\theta_1$ and $\theta_2$ are known).
e) $H_5$: To test $A_1 = A_2$
(assuming $\theta_1 = \theta_2$, but that the common value is unknown).
f) $H_6$: To test $A_1 = A_2$.
g) $H_7$: To test $A_1 = A_2$ and $\theta_1 = \theta_2$.
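B. Likelihood ratio tests.
In a), b) and c) below we let

(19)   $K = \dfrac{(r_1 + r_2)^{r_1 + r_2}}{r_1^{r_1}\, r_2^{r_2}}$.

a) For $H_1$:

(20)   $\lambda_1 = K\left[(1 + c_1)^{r_1}\left(1 + \dfrac{1}{c_1}\right)^{r_2}\right]^{-1}$

where

(21)   $c_1 = \left[\sum_{j=1}^{r_2}(x_{2j} - A_2) + (n_2 - r_2)(x_{2r_2} - A_2)\right] \Big/ \left[\sum_{j=1}^{r_1}(x_{1j} - A_1) + (n_1 - r_1)(x_{1r_1} - A_1)\right]$.

b) For $H_2$:

(22)   $\lambda_2 = K\left[(1 + c_2)^{r_1}\left(1 + \dfrac{1}{c_2}\right)^{r_2}\right]^{-1}$ if $x_{11} < x_{21}$,   $\lambda_2 = K\left[\left(1 + \dfrac{1}{c_2'}\right)^{r_1}(1 + c_2')^{r_2}\right]^{-1}$ if $x_{21} < x_{11}$,

where

(23)   $c_2 = \left[\sum_{j=1}^{r_2}(x_{2j} - x_{11}) + (n_2 - r_2)(x_{2r_2} - x_{11})\right] \Big/ \left[\sum_{j=1}^{r_1}(x_{1j} - x_{11}) + (n_1 - r_1)(x_{1r_1} - x_{11})\right]$,

       $c_2' = \left[\sum_{j=1}^{r_1}(x_{1j} - x_{21}) + (n_1 - r_1)(x_{1r_1} - x_{21})\right] \Big/ \left[\sum_{j=1}^{r_2}(x_{2j} - x_{21}) + (n_2 - r_2)(x_{2r_2} - x_{21})\right]$.

c) For $H_3$:

(24)   $\lambda_3 = K\left[(1 + c_3)^{r_1}\left(1 + \dfrac{1}{c_3}\right)^{r_2}\right]^{-1}$

where

(25)   $c_3 = \left[\sum_{j=1}^{r_2}(x_{2j} - x_{21}) + (n_2 - r_2)(x_{2r_2} - x_{21})\right] \Big/ \left[\sum_{j=1}^{r_1}(x_{1j} - x_{11}) + (n_1 - r_1)(x_{1r_1} - x_{11})\right]$.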


d) For H, : 
(26) y¥ =e? 
where 
(27) 

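e) For $H_5$:

(28)   $\lambda_5 = (1 + c_5)^{-(r_1 + r_2)}$ if $x_{11} > x_{21}$,   $\lambda_5 = (1 + c_5')^{-(r_1 + r_2)}$ if $x_{11} < x_{21}$,

where

(29)   $c_5 = n_1(x_{11} - x_{21}) \Big/ \sum_{i=1}^{2}\left[\sum_{j=1}^{r_i}(x_{ij} - x_{i1}) + (n_i - r_i)(x_{ir_i} - x_{i1})\right]$,

       $c_5' = n_2(x_{21} - x_{11}) \Big/ \sum_{i=1}^{2}\left[\sum_{j=1}^{r_i}(x_{ij} - x_{i1}) + (n_i - r_i)(x_{ir_i} - x_{i1})\right]$.

f) For $H_6$:

(30)   $\lambda_6 = (1 + c_6)^{-r_1}$ if $x_{11} > x_{21}$,   $\lambda_6 = (1 + c_6')^{-r_2}$ if $x_{11} < x_{21}$,

where

(31)   $c_6 = n_1(x_{11} - x_{21}) \Big/ \left[\sum_{j=1}^{r_1}(x_{1j} - x_{11}) + (n_1 - r_1)(x_{1r_1} - x_{11})\right]$,

       $c_6' = n_2(x_{21} - x_{11}) \Big/ \left[\sum_{j=1}^{r_2}(x_{2j} - x_{21}) + (n_2 - r_2)(x_{2r_2} - x_{21})\right]$.

g) For $H_7$:

(32)   $\lambda_7 = \hat{\theta}_1^{\,r_1}\,\hat{\theta}_2^{\,r_2} \big/ \hat{\theta}^{\,r_1 + r_2}$

where

(33)   $\hat{\theta}_i = \dfrac{1}{r_i}\left[\sum_{j=1}^{r_i}(x_{ij} - x_{i1}) + (n_i - r_i)(x_{ir_i} - x_{i1})\right]$,

       $\hat{\theta} = \dfrac{1}{r_1 + r_2}\sum_{i=1}^{2}\left[\sum_{j=1}^{r_i}(x_{ij} - \hat{A}) + (n_i - r_i)(x_{ir_i} - \hat{A})\right]$,

and where $\hat{A} = \min(x_{11}, x_{21})$.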

C. Reduced Tests.

By the use of the lemmas in Section 2, $\lambda_1, \lambda_2, \cdots, \lambda_6$ can be reduced to the
following equivalent tests having the corresponding distributions (see Table 1).
The authors have not succeeded in reducing $\lambda_7$ to an $F$-test or a $\chi^2$-test (as was
possible in $\lambda_1, \lambda_2, \cdots, \lambda_6$). We should like to mention, however, that in [3]
Sukhatme found a cdf for $\lambda_7$ in the special case where $r_1 = n_1$ and $r_2 = n_2$. If
further $n_1 = n_2$ the cdf he obtained involves the inverse hyperbolic cosine.
Undoubtedly one can obtain a similar result for the cdf of $\lambda_7$ especially if $r_1 = r_2$.

In Table 1, numbers in the "critical regions" column indicate that the reduced
tests obtained may be either one-sided or two-sided. For example, consider the
case where $r_1 = r_2 = 10$ and $\alpha = .05$. Then for the various $H_i$, $i = 1, 2, 3, 4, 5, 6$,
we have the following critical regions which are summarized for convenience in
Table 2.

It should be noted that under $H_2$, $H_4$, $H_5$, and $H_6$ the distribution of the
appropriate $\lambda$ criteria consists of two parts, that is, depending on whether $x_{11} >
x_{21}$ or $x_{11} < x_{21}$. In order to maintain a $\lambda$ criterion in the form $0 < \lambda < c$, for
some appropriate constant $c$, we should use (if $r_1 = r_2$) critical regions of size $\alpha$
in each of the two parts. If $r_1 \ne r_2$, then, in the case of $H_6$, one has to abandon
the criterion in the simple form just given, because $\Pr(x_{11} > x_{21})$ is unknown.
In order to obtain a test which is of size $\alpha$, the statistician is forced to use critical
regions of size $\alpha$ in each of the two parts. If $r_1 \ne r_2$ and one is dealing with $H_2$,
then it is possible (but not easy) to maintain the $\lambda$ criterion in its usual form.
However it seems appropriate on practical grounds to use critical regions of equal
size $\alpha$ in each of the two parts.

D. Derivation of the test under $H_2$.


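TABLE 1

     Equivalent Reduced Tests                                          Distributions

1.   $f_1 = \dfrac{r_1}{r_2}\,c_1$                                           $F(2r_2, 2r_1)$

2.   $f_2 = \dfrac{r_1 - 1}{r_2}\,c_2$, if $x_{11} < x_{21}$                 $F(2r_2, 2r_1 - 2)$
     $f_2' = \dfrac{r_2 - 1}{r_1}\,c_2'$, if $x_{21} < x_{11}$               $F(2r_1, 2r_2 - 2)$

3.   $f_3 = \dfrac{r_1 - 1}{r_2 - 1}\,c_3$                                   $F(2r_2 - 2, 2r_1 - 2)$

4.   $c_4$                                                                   $\chi^2(2)$

5.   $f_5 = \dfrac{2r_1 + 2r_2 - 4}{2}\,c_5$, if $x_{11} > x_{21}$           $F(2, 2r_1 + 2r_2 - 4)$
     $f_5' = \dfrac{2r_1 + 2r_2 - 4}{2}\,c_5'$, if $x_{21} > x_{11}$         $F(2, 2r_1 + 2r_2 - 4)$

6.   $f_6 = \dfrac{2r_1 - 2}{2}\,c_6$, if $x_{11} > x_{21}$                  $F(2, 2r_1 - 2)$
     $f_6' = \dfrac{2r_2 - 2}{2}\,c_6'$, if $x_{21} > x_{11}$                $F(2, 2r_2 - 2)$
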


Since the derivations are similar in all cases, it would be sufficient to give one
of them as an illustration. For the case $H_2$, the proof is as follows.
Assuming $A_1 = A_2 = A$, then the likelihood function is given by

(34)   $h(x_{11}, \cdots, x_{1r_1}, x_{21}, \cdots, x_{2r_2}; A, \theta_1, \theta_2) = \prod_{i=1}^{2} \dfrac{n_i!}{(n_i - r_i)!\,\theta_i^{r_i}} \exp\left\{-\dfrac{1}{\theta_i}\left[\sum_{j=1}^{r_i}(x_{ij} - A) + (n_i - r_i)(x_{ir_i} - A)\right]\right\}$.
TABLE 2
Critical Regions


fi > 246 or fi 


fo > 2.56 <=s= when tn < 2 


, . 
fe > 2. <ss when za < tu 


fs 


fa 


when 
and 
when In > Ly 


when 2 > 2 
and 
> 3.55 when 2 > 21 


, 


te 


In the whole parameter space $\Omega$: $A, \theta_1, \theta_2 > 0$ we obtain the maximum
likelihood estimates

(35)   $\hat{A} = \min(x_{11}, x_{21})$,

(36)   $\hat{\theta}_i = \dfrac{1}{r_i}\left[\sum_{j=1}^{r_i}(x_{ij} - \hat{A}) + (n_i - r_i)(x_{ir_i} - \hat{A})\right]$.

In the subspace $\omega$: $A > 0$, $\theta_1 = \theta_2 = \theta > 0$ we have

(37)   $\hat{A} = \min(x_{11}, x_{21})$,

(38)   $\hat{\theta} = \dfrac{1}{r_1 + r_2}\sum_{i=1}^{2}\left[\sum_{j=1}^{r_i}(x_{ij} - \hat{A}) + (n_i - r_i)(x_{ir_i} - \hat{A})\right]$.

Hence, it can be easily verified that the likelihood ratio is given by $\lambda_2$ as in (22).
Further, $\lambda_2$ is a function of $c_2$ (or $c_2'$), which under $H_2$ can be written as

(39)   $c_2 = \dfrac{v_2 + w_2}{v_1}$,   if $x_{11} < x_{21}$,

(40)   $c_2' = \dfrac{v_1 + w_1}{v_2}$,   if $x_{21} < x_{11}$.

Consequently, the reduced test given in Table 1 follows from Lemma 6 and other
standard results on the distribution of the sum and ratio of independent chi
squares.
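
The derivation above can be traced on a small data set. The sketch below is illustrative and makes several assumptions that should be checked against the displayed formulas: the likelihood ratio is taken as the usual ratio of maximized likelihoods, which with the estimates (35) to (38) reduces to $\hat{\theta}_1^{r_1}\hat{\theta}_2^{r_2}/\hat{\theta}^{r_1+r_2}$; the constant $K$ is taken as $(r_1 + r_2)^{r_1+r_2}/(r_1^{r_1} r_2^{r_2})$; and the reduced statistic is scaled to an $F$ variable by its degrees of freedom as implied by (39), (40) and Lemma 6. The data and all function names are hypothetical.

```python
def total_life(x, n, origin):
    """Sum of (x_j - origin) over the r observed failures plus (n - r)(x_r - origin)."""
    r = len(x)
    return sum(xj - origin for xj in x) + (n - r) * (x[-1] - origin)

def lambda2(x1, n1, x2, n2):
    """Likelihood ratio for H2 and the reduced F statistic with its degrees of freedom.

    x1 and x2 hold the first r1 and r2 ordered failure times of samples of
    sizes n1 and n2.
    """
    r1, r2 = len(x1), len(x2)
    a_hat = min(x1[0], x2[0])                       # (35) and (37)
    t1 = total_life(x1, n1, a_hat)
    t2 = total_life(x2, n2, a_hat)
    theta1, theta2 = t1 / r1, t2 / r2               # (36)
    theta = (t1 + t2) / (r1 + r2)                   # (38)
    ratio = theta1 ** r1 * theta2 ** r2 / theta ** (r1 + r2)
    if x1[0] < x2[0]:
        # c2 = (v2 + w2) / v1, a ratio of chi-squares with 2*r2 and 2*r1 - 2 d.f.
        f_stat, dof = (r1 - 1) / r2 * (t2 / t1), (2 * r2, 2 * r1 - 2)
    else:
        # c2' = (v1 + w1) / v2, a ratio of chi-squares with 2*r1 and 2*r2 - 2 d.f.
        f_stat, dof = (r2 - 1) / r1 * (t1 / t2), (2 * r1, 2 * r2 - 2)
    return ratio, f_stat, dof

def lambda2_from_c(c, r1, r2):
    """The same ratio written as K / [(1 + c)^r1 (1 + 1/c)^r2] with c = t2 / t1."""
    K = (r1 + r2) ** (r1 + r2) / (r1 ** r1 * r2 ** r2)
    return K / ((1.0 + c) ** r1 * (1.0 + 1.0 / c) ** r2)

# Hypothetical life-test data: first 4 of 10 failures and first 3 of 8 failures.
x1, n1 = [1.2, 1.9, 2.4, 3.1], 10
x2, n2 = [0.9, 1.5, 2.8], 8
ratio, f_stat, dof = lambda2(x1, n1, x2, n2)
print(ratio, f_stat, dof)
```

Small values of the ratio point away from $\theta_1 = \theta_2$; in practice the decision would be made by referring the reduced statistic to the tabled $F$ distribution with the returned degrees of freedom.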
E. Some properties of the various tests.

A number of properties for the various reduced tests are of interest. Some of
them are:

(a) The tests (critical regions on $\lambda$) for $H_1$, $H_3$, $H_4$, $H_5$ do not depend on
$n_1$ and $n_2$. This statement is also true for $H_2$ and $H_6$ if $r_1 = r_2$.

(b) The power of the tests for $H_1$ and $H_3$ and also for $H_2$ if $r_1 = r_2$ does not
depend on $n_1$ or $n_2$.

(c) The tests are unbiased.

(d) The power of the tests for $H_2$ and $H_3$ is independent of $A_1$ and $A_2$.

These properties are fairly obvious. There are other properties which can be
discovered by a more detailed investigation.

NOTES


POWER FUNCTIONS OF THE SIGN TEST AND POWER EFFICIENCY
FOR NORMAL ALTERNATIVES


By W. J. Dixon
University of Oregon 
1. Summary. Power functions are tabulated for the sign test for various
sample sizes and $\alpha$ near .05 and .01. Several of these power functions are
compared with the power function of the $t$-test for samples from normal populations
by means of a power efficiency function. The results indicate decreasing power
efficiency for increasing sample size, for increasing level of significance and for
increasing alternative.


2. Power function. The power of the two-sided sign test for level of significance
$\alpha$ is given by:

(1)   $\lambda(p) = \sum_{j=0}^{i} \binom{N}{j}\left[p^{j}(1 - p)^{N-j} + p^{N-j}(1 - p)^{j}\right]$

where $i$ is the largest integer such that

(2)   $\sum_{j=0}^{i} \binom{N}{j}(1/2)^{N} \le (1/2)\alpha$

and $N$ is considered fixed [5]. Here, $p$ is the alternative population proportion.
Values for $\lambda(p)$ may be obtained readily from a table of the cumulative binomial
[1] or tables of the incomplete beta function [2] since

$\sum_{j=0}^{i} \binom{N}{j} x^{j}(1 - x)^{N-j} = I_{1-x}(N - i, i + 1)$.

Beyond the range of these tables the approximation of Camp [3] can be used
with great accuracy. The maximum $i$ which satisfies (2) is tabulated as $r$ in
Table I of reference [5] for $\alpha = .01$ and $\alpha = .05$. Tables I and II of this paper
give the power for these critical values. Since $p = .50$ is the null hypothesis,
the values in the column headed $p = .50$ in Tables I and II of this paper give
the actual level of significance ($\le .01$ or $\le .05$) of each test. At the foot of the
tables are the normal alternatives corresponding to the alternative $p$, that is,
$\delta$ is defined by the relation $1 - F(\delta) = p$ where $F(x)$ is the cumulative zero mean
unit variance normal distribution. For normal alternatives Tables I and II may
be entered either with $p$ or $\delta$. For nonnormal alternatives the tables must, of
course, be entered with $p$.
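
The power formula (1) and the rule (2) for the critical value translate directly into a short computation. The sketch below is an illustration with hypothetical function names; it evaluates the binomial sums exactly rather than through tables of the cumulative binomial or the incomplete beta function.

```python
from math import comb

def critical_index(N, alpha):
    """Largest i with sum_{j<=i} C(N, j) / 2^N <= alpha / 2, or None if no i qualifies."""
    tail, best = 0.0, None
    for i in range(N + 1):
        tail += comb(N, i) / 2 ** N
        if tail <= alpha / 2:
            best = i
        else:
            break
    return best

def sign_test_power(N, i, p):
    """Power of the two-sided sign test with critical index i at alternative p."""
    return sum(
        comb(N, j) * (p ** j * (1 - p) ** (N - j) + p ** (N - j) * (1 - p) ** j)
        for j in range(i + 1)
    )

i = critical_index(10, 0.05)
print(i, sign_test_power(10, i, 0.50))   # exact level of the N = 10 test
print(sign_test_power(6, 0, 0.45))       # power of the N = 6 test at p = .45
```

For $N = 10$ and $\alpha = .05$ the critical index is 1 and the exact level is $22/1024 \approx .02148$. For $N = 5$ no index satisfies (2) at $\alpha = .05$, since even $i = 0$ gives a level of $.0625$.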
TABLE I
Power for Sign Test ($\alpha \le .05$)


45 40 3s | 30 25 05 
‘35 60 | 165 "70 75 ‘85 : ‘95 


.06878} .08800} . 12128) . 17050) .23828) . j .77378 
-03125) .03598) .05075) 07726} . 11838]. 17822) . ; i . 73509 
01562) .01896) .02963) .04967} .08257).13354| . ‘ : .69834 
-00781) .01005) .01745) .03209} 05771). 10013) . : ‘ .66342 
03906) .04760) .07435) . 12248) . 19644). ‘ ; 4 .92879 


-02148} .02776] .04804) .08649} . 14945). , : , 91386 
. 10938} . 12695) . 17958} . 26643) .38437] . ‘ : a - 98850 
01172) .01614) .03097) .06079} . 11304]. .32212 | . . 68 .89811 
-03857| .05002) .08625) . 15214). 25302) . -55835 | . - 888 -98043 
-02246} .03105) .05922) . 11354). 20255) . .50165 | . 4 2 .97549 
.01294 .01916) .04040) .08407) . 16086) . .44805 |. 4 . 96995 





-03516} .04875) .09243} . 17318} . 29696) . -64816 | . ‘ .99453 
-02127| .03158} .06609) . 13406} . 24589) . -59813 | . ; - 99300 
-04904} .06820) . 12852) .23542) 38879). 75822 | . ‘ .99884 
-03088} .04598) .09545) . 18888) . 33269) . -71635 | . _ -99845 
01921} .03074) .07025) . 15011} 28224) . .67329  .§ 964! .99799 


-01182] .02039} .05127) . 11825}. 23751). -62965 | . ‘ .99743 
-04139} 06177) . 12721) .24571).41641). .80421 | . . .99967 
- 11532} . 15135) . 25648) .41815} .60827). .91331 | . ‘ . 99997 
7 ~~ .04329) .06968) . 15476) .30626) .51187). .89088 | . : . 99998 


9 | .04277) .07442) . 17714) .35764) . 58882) . .93891 | . : .00000- 
11 | .04096) 07712). 19577] .40198) .65156 . 96564 
13 | 03848} .07848) . 21156) .44079) . 70325) . 98059 
15 .03570) .07894) . 22517) .47519} 74622). . 98900 
17 = .03284) .07877) . 23706) . 50598) .78219] . .99374 


20 .02734] .07722} . 25689) . 55903]. 83818). 99796 

26 | .04139). 11635} 36009) 69503) .92220) . .99974 1 

30 —.03299] . 10895} . 36877) .72353} .94125}. 99991 1. 

35, .04460). 2} .46008] . 81223) .97256} . .00000- 
100 39 ..03520}. 13519] 46206) .82758] .97900! . .00000- 


Normal alter- 
natives 

6 12% 2534 |.3853 |.5244 |.6745 | .8416 1.0364 1.2816 1.6449 

/26 . |.3583 |.5449 |.7416 |.9539 |1.1902 1.4657 1.8124 2.3262 








3. Power efficiency. Discussion of the power of the sign test for normal
alternatives was given in [4] for large $N$. That paper obtains $100(2/\pi) = 63.7$ per cent
as the efficiency. Reference [5], by a rough coincidence of the power function of
the sign test for a sample of $N$ observations with the power function of the $t$-test
for some smaller sample size, obtained ratios of sample sizes of .67 for $N = 18$
and .65 for $N = 44$. This ratio of sample sizes is defined to be power efficiency
by Walsh [12]. He defines as equivalent power curves those whose average height
is the same. For finite $N$ an essential difficulty arises, since the power curves


TABLE II
Power for Sign Test ($\alpha \le .01$)


40 «35 -80 
-60 -65 -70 


01005 .01745 .03209' .05771 . : 27 43047 
-00536 .01034' .02079 .04037,.07509) . ; 38742 
.00287 .00615 .01349 .02825 .05631 . ; 34868 


.00155, .00367 .00876, .01978 .04224 .085¢ ; 31381 
.00937 .01991 .04252 .08504 .15838 . / 65900 
.00543 .01276 .02961) .06367 .12671 23: s 62135 
.00314! .00816 .02053' .04748).10097, . a -58463 
.01176 | .02739| .06179) . 12684 .23609  . ; 81594 


—— =m © 


to 


.00719, .01846 .04511 .09936 .19711, . .78925 
00437 .01238 .03273 .07739|.16370) .: ; .76180 
.01298 .03300) .07830!. 16455 .30569 . .90180 
.00825' .02306'.05915).13317).26309 . | 68415 | .88500 
00521 .01601!.04438 .10709 .22516) . .86705 


2 
2 
3 
3 
3 


.00898, .02942, .08263). 19349 .37828, . d - 96660 
.01253 .04357 .12377, .28138 .51429) .76079 | . . 99222 
.01565) 05757). 16510) .36458 .62632 
.01829 .07098! . 20528) .44061 .71514 
.02048 .08365 . 24370) .50875, .78408 
.02226, 09552). 28010) .56918 .83692 


60 d .02483 .11697 .34678) .66916 .90752 
23 i. .02637 .13568 .40577) .74592) .94774 
80 28 |. .04485 .21312,.55120) .86331 .98337 
90 32 |. .04566 .22674 .59138) .89569 .99075. . 
100 36 |. .04300 .23868 .62692).92011 .99482 


Normal alter- 

natives 
6 .1257 .2534 .3853 .5244 .6745 =.8416 1.0364 1.2816 1.6449 
/26 17 .3583 |.5449 .7416 .9539 1.1902 1.4657 1.8124 2.3262 


differ in shape. The equivalence "by sight" or by an averaging process disguises
these differences in shape. It would seem more realistic to define a power efficiency
function which gives the power efficiency for each alternative. This function
has been obtained for the sign test for $N = 5, 10, 20$ and is given in Fig. 1.

[Fig. 1. Power efficiency curves for the sign test; curve labels legible in this copy: $\alpha = 0.00196$, $\alpha = 0.02148$, $\alpha = 0.10938$.]

The power function of the $t$-test was compared for $\alpha$ corresponding to particular exact
values of $\alpha$ for the sign test. These curves show decreasing power efficiency for
increasing $N$, for increasing $\alpha$, and for increasing alternative. The alternative,
$\delta$, is a shift in mean in standard deviation units of observation differences. For
samples from two normal populations with means $\mu_1$ and $\mu_2$ and standard
deviations both equal to $\sigma$ we have $\sqrt{2}\,\delta = |\mu_1 - \mu_2|/\sigma$. Limiting power efficiencies
for large $\delta$ were computed for the curves graphed. The alternatives
corresponding to power equal .50, .90 and .99 are indicated on each curve. Table III lists
the power efficiency for the cases studied and the power functions for the sign
test are in Tables I and II. Cases used in this comparison but not satisfying the


TABLE III 
Power Efficiency of the Sign Test 


Alternatives N=5 10 10 10 
p 6 a = .0625 .0020 .0215 .1094 
.50 0 96. 94.0 85.0 76. 
45 . 1257 96. 93. 84.9 76. 
-40 . 2534 96. , 84. 76. 
.30 .3853 96. 8 84.5 76. 
.30 .5244 95.8 .1 84.0 76. 
.20 .6745 95. 91.2 ‘ 75.6 
.20 . 8416 95. 74.§ 
15 .0364 94. 74. 
.10 . 2816 93 .¢ iw 
05 .6449 92.8 86. 
.03 .8808 92. $4 
Ol 2.3263 90.6 82. 
-005 2.5758 89.§ 81. 
.001 3.0902 38. 78. 
o 80.6 58. 


ao 
s19N Ss SF SI SS Ss 
— Wee oT oI & 
,oaxntar si s7 5] 
10 © 


jr) 
> 
~ 
= 


‘ 


70.5 


— i CO DS bo 


1~] 


e 
~ 


04.2 52.7 51.8 


requirement of largest $\alpha \le .01$ or $\le .05$ are indicated by parentheses in Table I.
Additional powers not tabulated there are: 


p =        .03      .01      .005     .001

N =  5   .85873   .95099   .97525   .99501
N = 10   .73742   .90438   .95111   .99005
N = 10   .96549   .99573   .99890   .99996
N = 10   .99724   .99989   .99999
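
The tabulated powers can be checked with a short computation. The sketch below is illustrative only: it assumes the two-sided sign test rejects when the number of minus signs is at most r or at least N - r, and that p = 1 - Φ(δ) is the chance of a minus sign under a normal alternative δ; the function names are hypothetical.

```python
from math import comb, erf, sqrt


def minus_sign_probability(delta):
    """p = 1 - Phi(delta): chance that one observation difference is negative."""
    return 0.5 * (1.0 - erf(delta / sqrt(2.0)))


def sign_test_power(n, r, p):
    """Power of the two-sided sign test with critical value r.

    X ~ Binomial(n, p) counts minus signs; the test rejects when
    X <= r or X >= n - r (assumes 2 * r < n so the tails do not overlap).
    """
    pmf = [comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(n + 1)]
    return sum(pmf[: r + 1]) + sum(pmf[n - r:])


if __name__ == "__main__":
    print(round(sign_test_power(5, 0, 0.5), 4))    # size of the N = 5 test
    print(round(sign_test_power(5, 0, 0.03), 5))   # power at p = .03
    print(round(sign_test_power(10, 1, 0.03), 5))  # N = 10, r = 1
```

Setting p = .5 returns the exact level of significance, which is how the values .0625 and .0215 in the heading of Table III arise.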


Examination of the efficiencies stated by Walsh in [9], [12], [13] confirms the
statement of Jeeves and Richards [7] that the approximation used by Walsh would
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consistently overestimate the true efficiency. This effect is large only for small 
sample sizes. For example, Walsh gives 96 per cent efficiency for N = 5 and 
reference to Table III shows this to be the highest point on the curve. On the 
other hand, the value of .7 efficiency for a = .05 given by Jeeves and Richards 
for N = 6 to 20 is quite reasonable for N = 20 but seems too low for N = 6 to 
10. Jeeves and Richards use a randomized test, and since the efficiency depends 
greatly on the level of significance comparisons are difficult. 


4. Computation of power efficiency function. The power function of the t-test
was computed for several degrees of freedom at levels of significance corresponding
to those of the sign test. Computation was effected using the formulas of
Nicholson [8]. Interpolation for fractional degrees of freedom of the t-test giving
equivalent power to the sign test was made on z-scores (normal deviates). This
procedure was followed since the power curves of the t-test are representable
approximately as normal cumulative curves. In most cases linear interpolation
proved to be satisfactory. However, this method was not satisfactory for $\delta$ near
zero. Since the power curves for the sign and t-test agree in magnitude and slope
at $\delta = 0$ the limiting power efficiency may be obtained by interpolating among
the second derivatives of the power functions. The second derivative for the
sign test at $\delta = 0$, sample size $N$ and critical value $r_{\alpha/2}$ is


$$\frac{1}{\pi\,2^{N-2}} \sum_{i=0}^{r_{\alpha/2}} \binom{N}{i}\left[(2i - N)^2 - N\right].$$


The second derivative for the t-test for $\nu$ degrees of freedom and critical value
$t_{\alpha/2}$ is


B(1/2, v/2)(v + —-" . 


The limiting efficiency for $\delta = 0$ as $N$ becomes infinite can also be obtained from
these derivatives and the result $2/\pi$ agrees with the value obtained by Cochran
[4].

This limit is the same for arbitrary fixed $\alpha$ and it appears that the limiting
power efficiency curve for large $N$ and fixed $\delta$ approaches zero for increasing $\delta$.
No proof of this statement was obtained. However, it is not inconsistent with the
statement of Walsh [10], [11] that the sign test and the t-test have the same power
function asymptotically when sample sizes are in ratio $2/\pi$.

The values indicated for limiting power efficiency for large $\delta$ represent ratios
of sample sizes similar to those for finite $\delta$. Nicholson [8] gives an expression for
the power of the t-test with $\nu$ degrees of freedom. The leading term for large $\delta$ is


3 
” 3) Ve 


where z = 6V/»(1 — za) and 2, satisfies 
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sp lirta- nite =e 


Using the approximation of ordinate over abscissa for the cumulative normal
for extreme abscissa we find that $z$ is the abscissa of a cumulative normal which
is approximately equal to the power of the t-test for alternative $\delta$. In a similar
manner the normal approximation to the binomial yields $z = \delta\sqrt{r + 1}$ for the
sign test. A fixed value of $N$ and $\alpha$ determines $r$, $\alpha$, $x$, and we may solve for $\nu$.
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THE ADMISSIBILITY OF CERTAIN INVARIANT STATISTICAL 
TESTS INVOLVING A TRANSLATION PARAMETER 


By E. L. LEHMANN AND C. M. STEIN


University of California, Berkeley, and University of Chicago 
1. Introduction. The notion of invariance (or symmetry) has such strong 
intuitive appeal that many current statistical procedures have the invariance 
property and are in fact the best invariant procedures although they were pro- 
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posed long before a general discussion of invariance was available. Hotelling [1], 
[2] and Pitman [3], [4] emphasized the invariant nature of certain tests and esti- 
mates. A general definition of the notion for the problem of testing hypotheses 
was given by Hunt and Stein who showed that in this case under severe restric- 
tions on the group of transformations an optimum invariant test is most stringent 
or more generally minimax with respect to an invariant loss function (see [5]). 
This result has been extended to more general decision problems and more 
general groups by Peisakoff [6]. However, these results do not prove admissibility
of the procedures in question unless the group of transformations is compact. 

The problem of admissibility in the case of point estimation of a location 
parameter was treated in the normal case by Blyth [7] and by Hodges and 
Lehmann [8] and for a general class of location parameter problems by Blackwell
[9]. In the latter paper the surprising fact was brought to light that even
in the location parameter problem the best invariant estimate may, under cer- 
tain circumstances, be inadmissible. 

In the present note we prove under conditions which are presumably unneces- 
sarily restrictive the admissibility of the most powerful invariant test for testing 
one location parameter family against another. As an example, consider the 
problem in which $Z_1, \ldots, Z_n$ are normally distributed with unknown mean $\xi$
and variance $\sigma^2$. If we wish to test $H: \xi \le 0$ against the alternatives $K: \xi > 0$,
it was already pointed out in ([5], p. 15) that Student's t-test is admissible for
this problem. This result is quite elementary and rests on the fact that unbiasedness
in this case implies that the probability of rejection equals the level of significance
for all points $(\xi, \sigma)$ with $\xi = 0$. However, this argument breaks down if we
introduce an indifference zone and restrict our class of alternatives to $K': \xi/\sigma \ge \delta$
where $\delta$ is some specified positive number.

Consider now the general problem in which one observes a random point
$(X, Y)$ where $X$ ranges over an arbitrary set, $Y$ over the real line. There are two
hypotheses $H_i$ according to which the distribution of $(X, Y - \eta)$ is $F_i$ $(i = 1, 2)$
where $\eta$ is unspecified. The problem discussed above is an example of this, if we
take $H_i$ to be $\xi/\sigma = \delta_i$, $X = \sum Z_j / \sqrt{\sum Z_j^2}$, $Y = \log \sum Z_j^2$ and $\eta = \log \sigma$. As
another example let $(Z_1 - \eta, Z_2 - \eta, \ldots, Z_n - \eta)$ have distribution $F_i$ under
$H_i$. Then we can take for $X$ the set of differences $X = (Z_1 - Z_n, \ldots, Z_{n-1} - Z_n)$
and for $Y$ the mean $\bar{Z}$ or the observation $Z_n$, or any of a number of other
statistics.


2. The principal theorem. Let $\mathcal{X}$ be a set (which for all practical purposes
may be taken to be a Euclidean space), $\mathcal{A}$ a $\sigma$-algebra of subsets of $\mathcal{X}$ (say, the
ordinary Borel sets if $\mathcal{X}$ is Euclidean), $\mathcal{R}$ the real line, $\mathcal{B}$ the set of all ordinary
Borel subsets of $\mathcal{R}$, $\lambda_1, \lambda_2$ probability measures on $\mathcal{A}$ and for each $x$, let $F_{1x}, F_{2x}$
be probability measures on $\mathcal{B}$ such that for each $B \in \mathcal{B}$, real $k$, and $i = 1, 2$,
$\{x \mid F_{ix}(B) < k\} \in \mathcal{A}$. We suppose that the distribution of the random point
$(X, Y)$ ranging over $\mathcal{X} \times \mathcal{R}$ is, for some real $\eta$, with $i = 1$ or 2

$$P_{i\eta}\{(X, Y) \in C\} = \int d\lambda_i(x) \int_{\{y \,:\, (x, y) \in C\}} dF_{ix}(y - \eta). \tag{1}$$

A test for the hypothesis $H_1$ that it is $P_{1\eta}$ (with $\eta$ unspecified) is a function $\varphi$ on
$\mathcal{X} \times \mathcal{R}$ to $[0, 1]$, $\mathcal{A}\mathcal{B}$ measurable. The test $\varphi$ is said to be better than $\varphi_0$ if for
all $\eta$

$$E_{1\eta}\varphi(X, Y) \le E_{1\eta}\varphi_0(X, Y), \qquad E_{2\eta}\varphi(X, Y) \ge E_{2\eta}\varphi_0(X, Y), \tag{2}$$

strictly better if (2) holds with strict inequality for some $\eta$. $\varphi_0$ is admissible if
there exists no $\varphi$ strictly better than $\varphi_0$.
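THEOREM 1. If $E_{10}|Y|, E_{20}|Y| < \infty$, $0 < c < 1$,

$$(\lambda_1 + \lambda_2)\left\{x \;\Big|\; \frac{d\lambda_2}{d(\lambda_1 + \lambda_2)}(x) = c\right\} = 0,$$

$\varphi_0$ is the test defined by

$$\varphi_0(x, y) = \begin{cases} 1 & \text{if } \dfrac{d\lambda_2}{d(\lambda_1 + \lambda_2)}(x) > c \\[2ex] 0 & \text{if } \dfrac{d\lambda_2}{d(\lambda_1 + \lambda_2)}(x) < c \end{cases} \tag{3}$$

and $\varphi$ is better than $\varphi_0$, then $\varphi - \varphi_0 = 0$ a.e. $(\lambda_1 + \lambda_2)\mu$ where $\mu$ is ordinary
Lebesgue measure on the real line.

COROLLARY. If in addition all $F_{ix}$ are absolutely continuous with respect to $\mu$,
then $\varphi_0$ is admissible.

The corollary is an immediate consequence of Theorem 1.
PROOF OF THEOREM 1. Putting $\lambda = \lambda_1 + \lambda_2$, $f(x) = d\lambda_2/d(\lambda_1 + \lambda_2)(x)$ we
can rewrite the condition (2) that $\varphi$ be better than $\varphi_0$

$$\int_{f(x) < c} (1 - f(x))\,d\lambda(x) \int (\varphi - \varphi_0)(x, y)\,dF_{1x}(y - \eta) - \int_{f(x) \ge c} (1 - f(x))\,d\lambda(x) \int (\varphi_0 - \varphi)(x, y)\,dF_{1x}(y - \eta) \le 0, \tag{4}$$

$$-\int_{f(x) < c} f(x)\,d\lambda(x) \int (\varphi - \varphi_0)(x, y)\,dF_{2x}(y - \eta) + \int_{f(x) \ge c} f(x)\,d\lambda(x) \int (\varphi_0 - \varphi)(x, y)\,dF_{2x}(y - \eta) \le 0. \tag{5}$$

Multiplying (4) by $c$ and (5) by $1 - c$ and adding we obtain

$$\begin{aligned}
&\int_{f(x) < c} c(1 - f(x))\,d\lambda(x) \int (\varphi - \varphi_0)(x, y)\,dF_{1x}(y - \eta) \\
&\qquad + \int_{f(x) \ge c} (1 - c)f(x)\,d\lambda(x) \int (\varphi_0 - \varphi)(x, y)\,dF_{2x}(y - \eta) \\
&\quad \le \int_{f(x) < c} (1 - c)f(x)\,d\lambda(x) \int (\varphi - \varphi_0)(x, y)\,dF_{2x}(y - \eta) \\
&\qquad + \int_{f(x) \ge c} c(1 - f(x))\,d\lambda(x) \int (\varphi_0 - \varphi)(x, y)\,dF_{1x}(y - \eta).
\end{aligned} \tag{6}$$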
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In order to derive the conclusion of Theorem 1 from (6) we shall need the 

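LEMMA. If $\mathcal{X}$, $\mathcal{A}$, $\mathcal{R}$, $\mathcal{B}$ are as before, $\rho$ a probability measure on $\mathcal{A}$, $h_1, h_2$ $\mathcal{A}$-measurable
functions on $\mathcal{X}$ to $[0, 1]$ with $h_1 - h_2 > 0$ a.e. $(\rho)$, $\psi$ an $\mathcal{A}\mathcal{B}$-measurable
function on $\mathcal{X} \times \mathcal{R}$ to $[0, 1]$, and for each $x$, $H_{1x}, H_{2x}$ probability measures on $\mathcal{B}$
such that

$$\text{for each } B \in \mathcal{B}, \text{ real } k \text{ and } i = 1, 2, \quad \{x \mid H_{ix}(B) \le k\} \in \mathcal{A}, \tag{7}$$

$$\int h_i(x)\,d\rho(x) \int |y|\,dH_{ix}(y) < \infty, \tag{8}$$

$$\int h_1(x)\,d\rho(x) \int \psi(x, y)\,dH_{1x}(y - \eta) \le \int h_2(x)\,d\rho(x) \int \psi(x, y)\,dH_{2x}(y - \eta) \quad \text{for all real } \eta, \tag{9}$$

then $\psi = 0$ a.e. $(\rho\mu)$.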
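PROOF OF LEMMA. We can rewrite (9)

$$\int h_1(x)\,d\rho(x) \int \psi(x, y + \eta)\,dH_{1x}(y) \le \int h_2(x)\,d\rho(x) \int \psi(x, y + \eta)\,dH_{2x}(y). \tag{10}$$

Now

$$\begin{aligned}
\int_{-n}^{n} d\eta \int \psi(x, y + \eta)\,dH_{2x}(y) - \int_{-n}^{n} \psi(x, \eta)\,d\eta
&= \int dH_{2x}(y) \left[\int_{-n+y}^{n+y} \psi(x, \eta)\,d\eta - \int_{-n}^{n} \psi(x, \eta)\,d\eta\right] \\
&\le \int_{y \ge 0} dH_{2x}(y) \int_{n}^{n+y} \psi(x, \eta)\,d\eta + \int_{y < 0} dH_{2x}(y) \int_{-n+y}^{-n} \psi(x, \eta)\,d\eta \\
&\le \int |y|\,dH_{2x}(y),
\end{aligned} \tag{11}$$

$$\begin{aligned}
\int_{-n}^{n} d\eta \int \psi(x, y + \eta)\,dH_{1x}(y) - \int_{-n}^{n} \psi(x, \eta)\,d\eta
&= \int dH_{1x}(y) \left[\int_{-n+y}^{n+y} \psi(x, \eta)\,d\eta - \int_{-n}^{n} \psi(x, \eta)\,d\eta\right] \\
&\ge -\int_{y \ge 0} dH_{1x}(y) \int_{-n}^{-n+y} \psi(x, \eta)\,d\eta - \int_{y < 0} dH_{1x}(y) \int_{n+y}^{n} \psi(x, \eta)\,d\eta \\
&\ge -\int |y|\,dH_{1x}(y).
\end{aligned} \tag{12}$$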
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Integrating (10) with respect to » from —n to n and using the final forms of 
(11), (12) we obtain 


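$$\int [h_1(x) - h_2(x)]\,d\rho(x) \int_{-n}^{n} \psi(x, \eta)\,d\eta \le \int h_1(x)\,d\rho(x) \int |y|\,dH_{1x}(y) + \int h_2(x)\,d\rho(x) \int |y|\,dH_{2x}(y). \tag{13}$$

Consequently,

$$\int [h_1(x) - h_2(x)]\,d\rho(x) \int_{-\infty}^{\infty} \psi(x, \eta)\,d\eta < \infty \tag{14}$$

and for every $\delta > 0$ there exists $n$ such that

$$\int [h_1(x) - h_2(x)]\,d\rho(x) \int_{|\eta| \ge n/2} \psi(x, \eta)\,d\eta \le \delta. \tag{15}$$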
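If instead of using the final forms of (11) and (12) for all $x$, we use them only in
the range $h_1(x) - h_2(x) < \epsilon$ and use the next to final forms when $h_1(x) - h_2(x) \ge \epsilon$
we obtain instead of (13)

$$\begin{aligned}
&\int [h_1(x) - h_2(x)]\,d\rho(x) \int_{-n}^{n} \psi(x, \eta)\,d\eta \\
&\quad \le \int_{h_1(x) - h_2(x) < \epsilon} d\rho(x) \left[h_1(x) \int |y|\,dH_{1x}(y) + h_2(x) \int |y|\,dH_{2x}(y)\right] \\
&\qquad + \int_{h_1(x) - h_2(x) \ge \epsilon} d\rho(x) \Bigg[h_1(x) \left\{\int_{y \ge 0} dH_{1x}(y) \int_{-n}^{-n+y} \psi(x, \eta)\,d\eta + \int_{y < 0} dH_{1x}(y) \int_{n+y}^{n} \psi(x, \eta)\,d\eta\right\} \\
&\qquad\qquad + h_2(x) \left\{\int_{y \ge 0} dH_{2x}(y) \int_{n}^{n+y} \psi(x, \eta)\,d\eta + \int_{y < 0} dH_{2x}(y) \int_{-n+y}^{-n} \psi(x, \eta)\,d\eta\right\}\Bigg].
\end{aligned} \tag{16}$$

The first term on the right-hand side can be made arbitrarily small by taking $\epsilon$
sufficiently small since $h_1(x) - h_2(x) > 0$ a.e. $(\rho)$, $0 \le h_i(x) \le 1$ (using (8)).
For given $\epsilon > 0$, the second half of the last term can be made arbitrarily small
by choosing $n \ge n(\epsilon)$ sufficiently large since by (15)

$$\int_{h_1(x) - h_2(x) \ge \epsilon} d\rho(x) \int_{|\eta| \ge n/2} \psi(x, \eta)\,d\eta \le \delta/\epsilon.$$

Also

$$\int_{h_1(x) - h_2(x) \ge \epsilon} d\rho(x)\,h_1(x) \int_{y \ge n/2} dH_{1x}(y) \int_{-n}^{-n+y} \psi(x, \eta)\,d\eta \le \int_{h_1(x) - h_2(x) \ge \epsilon} d\rho(x)\,h_1(x) \int_{y \ge n/2} y\,dH_{1x}(y). \tag{17}$$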
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Again, for fixed ¢ this can be made arbitrarily small by choosing n 2 n(e) suffi- 
ciently large. Finally 


$$\int_{h_1(x) - h_2(x) \ge \epsilon} d\rho(x)\,h_1(x) \int_{0}^{n/2} dH_{1x}(y) \int_{-n}^{-n+y} \psi(x, \eta)\,d\eta \le \int_{h_1(x) - h_2(x) \ge \epsilon} d\rho(x)\,h_1(x) \int_{-n}^{-n/2} \psi(x, \eta)\,d\eta,$$

which is disposed of in the same way as the second half of the last term. The
remaining integral with $y \le 0$ is analogous. Then, since the right-hand side of (16)
is arbitrarily small for sufficiently large $n$, $\psi = 0$ a.e. $(\lambda_1 + \lambda_2)\mu$. This completes
the proof of the Lemma.

To apply the Lemma to (6) we make the following identifications:

(i) If $f(x) < c$,

$$h_1(x) = c(1 - f(x)), \quad h_2(x) = (1 - c)f(x), \quad H_{1x} = F_{1x}, \quad H_{2x} = F_{2x}, \quad \psi = \varphi - \varphi_0.$$

(ii) If $f(x) \ge c$,

$$h_1(x) = (1 - c)f(x), \quad h_2(x) = c(1 - f(x)), \quad H_{1x} = F_{2x}, \quad H_{2x} = F_{1x}, \quad \psi = \varphi_0 - \varphi.$$

In any case $\rho = \lambda/2$. The reader will readily verify that (7), (8), (9) are satisfied
so that the theorem follows.

A moment's reflection shows that the origin of $\mathcal{R}$ for given $x$ is arbitrary so that
the hypotheses $E_{i0}|Y| < \infty$ could be replaced by: There exists an $\mathcal{A}$-measurable
real-valued function $\tau$ on $\mathcal{X}$ such that $E_{i0}|Y - \tau(X)| < \infty$.

It is seen that the admissibility of the noncentral t-test for testing $\xi/\sigma = \delta_0$
against $\xi/\sigma = \delta_1$ (central in case $\delta_0 = 0$) follows immediately from the theorem
since

E |\log= Zi| < « 


and P(>, Z; = vy Z:) = 0. 

Another example is that of testing for the same random variables $\sigma = \sigma_0$
against $\sigma = \sigma_1$. Here we may take $X = \sum (Z_i - \bar{Z})^2$ and $Y = \sum Z_i$. Actually
in this case the result can be proved quite easily by other means. Instead of 
taking for ¢ the usual least favorable sequence of a priori distributions which in 
the limit is invariant, we may, if oo < o take in H the a priori distribution 
P(¢ = a) = 1 where ais any constant, and in K a normal distribution with mean a 
and variance n(1/oi — 1/09). The Bayes solution is seen to be the F-test which 
is therefore admissible. (For details see [10]). 
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We can also consider the general linear hypothesis with no unknown means 
as nuisance parameters. For brevity we use the terminology of [5]. In the canonical
form we have $U_1, \ldots, U_m$, $V_1, \ldots, V_n$ independently normally distributed
with $EU_i = \nu_i$, $EV_j = 0$, $E(U_i - \nu_i)^2 = EV_j^2 = \sigma^2$ where $\sigma^2$, $\nu_i$ are unknown
and we want to test the hypothesis that all $\nu_i = 0$ say, against $\sum \nu_i^2 \ge \gamma\sigma^2$. A
sufficient statistic is $(U_1, \ldots, U_m, \sum V_j^2)$. The problem is invariant under rotation
of the vector $U_1, \ldots, U_m$ and multiplication of all $U_i$, $V_j$ by the same constant
$c$. Since the rotation group $G_0$ possesses a finite invariant measure, any test
invariant under $G_0$ and admissible among all tests invariant under $G_0$ is admissible.
Thus, in proving the usual F-test admissible we may restrict our attention to
tests depending only on $(\sum U_i^2, \sum V_j^2)$. Under multiplication by $c$ this goes into
$(c^2 \sum U_i^2, c^2 \sum V_j^2)$. Taking $X = \sum U_i^2 / \sum V_j^2$ and $Y = \log \sum V_j^2$, applying Theorem
1 and the optimum property of the F-test among all those based only on
$\sum U_i^2 / \sum V_j^2$, we obtain the admissibility of the usual test: Reject $H_0$ if $\sum U_i^2 \ge C \sum V_j^2$.
The same argument applies to the problem of testing $H_0: \sum \nu_i^2 \le \gamma_1\sigma^2$ against
$H_1: \sum \nu_i^2 \ge \gamma_2\sigma^2$ with $\gamma_2 > \gamma_1$.
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A NOTE ON DODGE’S CONTINUOUS INSPECTION PLAN! 


By GERALD J. LIEBERMAN
Stanford University 


1. Summary. In his first continuous sampling plan [1], H. F. Dodge showed 
that his procedure guarantees an Average Outgoing Quality Limit (AOQL) with 
the assumption that the process is in a state of statistical control. It is proved 
in this paper that the Dodge procedure, without the assumption of control, 
guarantees an AOQL, although different from that specified by Dodge. 


2. Introduction. In 1943, Dodge published a continuous sampling plan [1] 
in the Annals of Mathematical Statistics. The procedure, as stated by Dodge, 
is as follows: 

"(a) At the outset, inspect 100 per cent of the units consecutively as produced
and continue such inspection until i units in succession are found clear of defects.

"(b) When i units in succession are found clear of defects, discontinue 100
per cent inspection, and inspect only a fraction 1/k of the units, selecting individual
sample units one at a time from the flow of product, in such a manner as
to assure an unbiased sample.

"(c) If a sample unit is found defective, revert immediately to a 100 per cent
inspection of succeeding units and continue until again i units in succession are
found clear of defects, as in paragraph (a).

"(d) Correct or replace with good units all defective units found."

In his paper, Dodge studied the properties of this plan, and presented equations
and charts for determining the Average Outgoing Quality Limit (AOQL) as
functions of the parameters k and i, under the assumption that the process is in
a state of statistical control. A production process is said to be in statistical control
if there is a positive constant $p \le 1$ such that, for every item produced, the
probability that it is defective is p, and is independent of the state (defective
or nondefective) of all the other items produced.

The purpose of this paper is to show that the Dodge procedure guarantees 
an AOQL whether or not the process is in a state of statistical control. It is 
proved, without the assumption of control, that for a given k and i, an AOQL
is guaranteed. In fact, AOQL = (k - 1)/(k + i).

For a given k and i, the above value of the AOQL is always higher than that 
obtained with Dodge’s equations. However, it is achieved when the process 
alternates between producing all defective items during partial inspection and 
producing all nondefective items during 100 per cent inspection. As Dodge points 
out, [4], this worst possible behavior for the process is not a realistic model. The 
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results, therefore, should not be interpreted as implying the existence of practical 
limitations in the use of the plan as Dodge recommends; especially since the 
AOQL is itself an upper bound, and the actual outgoing quality is usually smaller. 

It is assumed throughout the paper, that observations are drawn at random, 
and each defective item found is replaced by a nondefective. The definition of 
the AOQL given in Section 3 is consistent with that given by Wald and Wolfowitz 
[2], and consequently many of the comments presented in their section on Funda- 
mental Notions (pp. 30-32) are pertinent to this paper. 


3. Method of proof. Define
$v_j$   number of defects being passed in the jth cycle (j = 1, 2, ..., m). A
      cycle is the period from where partial inspection begins, to the time a
      defective is observed.
$k$     an integer such that when the process is on partial inspection, one out of
      k items is inspected.
$i$     number of consecutive items free from defectives before partial inspection
      can begin.
$T_j$   number of items undergoing 100 per cent inspection from the end of the
      (j - 1)st cycle until i consecutive items are observed free from defects;
      and such that $T_j + i$ is the total number of items being inspected 100
      per cent from the end of the (j - 1)st cycle to the beginning of the jth
      cycle. $T_j \ge 0$.
$N_j$   number of groups of k items on partial inspection in the jth cycle. $N_j \ge 1$.


Define AOQL = smallest number $L$ with the property that for every process
the probability is zero that

$$\limsup_{m \to \infty} \frac{\sum_{j=1}^{m} v_j}{\sum_{j=1}^{m} (T_j + i + kN_j)} = \limsup_{m \to \infty} \frac{\sum_{j=1}^{m} v_j / m}{\sum_{j=1}^{m} (T_j + i + kN_j)/m} > L. \tag{1}$$


To obtain the AOQL it is evidently sufficient to consider the special class of
processes where the number of defectives in every segment on partial inspection
is $\ge 1$. Since $N_j \ge 1$, $T_j \ge 0$, we have for any such process

$$\limsup_{m \to \infty} \frac{\sum_{j=1}^{m} v_j / m}{\sum_{j=1}^{m} (T_j + i + kN_j)/m} \le \lim_{m \to \infty} \frac{\sum_{j=1}^{m} v_j / m}{k + i}. \tag{2}$$

(We shall show that the limit of $\sum v_j / m$ on the right-hand side exists and is
equal to $k - 1$ with probability one.)
It is important to note that the random variables $v_j$ are dependent. However,
if it can be shown that

$$E(v_j \mid v_1, \ldots, v_{j-1}) = k - 1, \qquad \sum_{j=1}^{\infty} \frac{E(v_j^2 \mid v_1, \ldots, v_{j-1})}{j^2} < \infty,$$


j=l 
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the Strong Law of Large Numbers [5] can be applied to the numerator of the 
right-hand side of the inequality of (2) and we then obtain

$$\text{AOQL} \le \frac{k - 1}{k + i}. \tag{3}$$

In fact, if the process is such that the proportion of defective items is zero
whenever items are on complete inspection, and the proportion of defective items
is 1 whenever items are on partial inspection, then

$$\limsup_{m \to \infty} \frac{\sum_{j=1}^{m} v_j / m}{\sum_{j=1}^{m} (T_j + i + kN_j)/m} = \frac{k - 1}{k + i}.$$

Hence it follows that AOQL = (k - 1)/(k + i).
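
The worst-case process just described can be traced numerically. The sketch below is an illustration under stated assumptions, not Dodge's or the author's procedure in full: it reads the garbled denominator as k + i, counts a whole group of k items for each partial-inspection segment (the paper's accounting), and uses a hypothetical function name.

```python
import random
from fractions import Fraction


def worst_case_outgoing_quality(k, i, cycles, seed=0):
    """Fraction of defectives passed when the process is adversarial.

    Assumed process: every item is good during 100 per cent inspection,
    every item is defective during partial inspection.
    """
    rng = random.Random(seed)
    produced = 0
    passed_defects = 0
    for _ in range(cycles):
        # 100 per cent inspection: i good units in a row, nothing to correct.
        produced += i
        # Partial inspection: one unit of the first group of k is sampled.
        sampled = rng.randrange(k)
        for position in range(k):
            produced += 1
            if position != sampled:
                passed_defects += 1  # an uninspected defective goes out
            # the sampled unit is found defective and replaced by a good one
    return Fraction(passed_defects, produced)


if __name__ == "__main__":
    print(worst_case_outgoing_quality(k=10, i=5, cycles=1000))
```

Every cycle contributes k - 1 passed defectives out of i + k items, whichever unit happens to be sampled, so the ratio does not depend on the seed.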


4. A lemma on the boundedness of $E(v_j^r \mid n_{j1}, n_{j2}, \ldots)$. Let $n_{jl}$ be the number
of defectives in the $l$th group of $k$ items after the start of the $j$th cycle,
$l = 1, 2, \ldots$. By definition of the expected value of a discrete random variable

$$\begin{aligned}
E(v_j^r \mid n_{j1}, n_{j2}, \ldots) ={}& \frac{n_{j1}}{k}(n_{j1} - 1)^r + \left(1 - \frac{n_{j1}}{k}\right)\frac{n_{j2}}{k}(n_{j1} + n_{j2} - 1)^r \\
&+ \left(1 - \frac{n_{j1}}{k}\right)\left(1 - \frac{n_{j2}}{k}\right)\frac{n_{j3}}{k}(n_{j1} + n_{j2} + n_{j3} - 1)^r + \cdots
\end{aligned} \tag{4}$$

The $s$th term of the right-hand side of equation (4) is bounded by
$(1 - 1/k)^{s-1}(sk)^r$ so that

$$E(v_j^r \mid n_{j1}, n_{j2}, \ldots) \le \sum_{s=1}^{\infty} (1 - 1/k)^{s-1}(sk)^r. \tag{5}$$

For any finite $r$, the right-hand side of inequality (5) is finite and independent
of the $n_{jl}$.
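
As a numerical check on the bound for r = 2, the series on the right-hand side of (5) can be summed directly. This is a sketch only; the closed form it is compared with, 2k^5 - k^4, follows from the identity sum of s^2 q^(s-1) = (1 + q)/(1 - q)^3 with q = 1 - 1/k and is an assumption about how the garbled bound quoted later should read. The function name is hypothetical.

```python
def second_moment_bound(k, terms=5000):
    """Partial sum of (1 - 1/k)**(s - 1) * (s * k)**2 for s = 1..terms."""
    q = 1.0 - 1.0 / k
    return sum(q ** (s - 1) * (s * k) ** 2 for s in range(1, terms + 1))


if __name__ == "__main__":
    for k in (2, 5, 10):
        print(k, second_moment_bound(k), 2 * k**5 - k**4)
```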


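5. Theorems and proofs.
THEOREM 1.

$$E(v_j \mid v_1, \ldots, v_{j-1}) = k - 1.$$

PROOF. Define

$$E(v_j \mid n_{j1}, n_{j2}, \ldots) = \varphi(n_{j1}, n_{j2}, \ldots).$$

Then

$$\varphi(n_{j1}, n_{j2}, \ldots) = \frac{n_{j1}}{k}(n_{j1} - 1) + \left(1 - \frac{n_{j1}}{k}\right)\left[n_{j1} + \varphi(n_{j2}, n_{j3}, \ldots)\right]. \tag{6}$$

Using the recursion formula a finite number of times, it follows that

$$\begin{aligned}
\varphi(n_{j1}, n_{j2}, \ldots) ={}& \left(1 - \frac{1}{k}\right)n_{j1} + \left(1 - \frac{n_{j1}}{k}\right)\left(1 - \frac{1}{k}\right)n_{j2} + \left(1 - \frac{n_{j1}}{k}\right)\left(1 - \frac{n_{j2}}{k}\right)\left(1 - \frac{1}{k}\right)n_{j3} + \cdots \\
&+ \left(1 - \frac{n_{j1}}{k}\right)\cdots\left(1 - \frac{n_{j,p-1}}{k}\right)\left(1 - \frac{1}{k}\right)n_{jp} \\
&+ \left(1 - \frac{n_{j1}}{k}\right)\cdots\left(1 - \frac{n_{jp}}{k}\right)\varphi(n_{j,p+1}, n_{j,p+2}, \ldots).
\end{aligned} \tag{7}$$

Let $p$ approach infinity in equation (7). From the results of Section 4
$\varphi(n_{j1}, n_{j2}, \ldots)$ is bounded for all $j$. Also,

$$\lim_{p \to \infty} \left(1 - \frac{n_{j1}}{k}\right)\cdots\left(1 - \frac{n_{jp}}{k}\right) = 0.$$

Hence

$$\varphi(n_{j1}, n_{j2}, \ldots) = \left(1 - \frac{1}{k}\right)\left[n_{j1} + \left(1 - \frac{n_{j1}}{k}\right)n_{j2} + \left(1 - \frac{n_{j1}}{k}\right)\left(1 - \frac{n_{j2}}{k}\right)n_{j3} + \cdots\right]. \tag{8}$$

Dividing $\varphi(n_{j1}, n_{j2}, \ldots)$ by $k - 1$ in equation (8)

$$\frac{\varphi(n_{j1}, n_{j2}, \ldots)}{k - 1} = \frac{n_{j1}}{k} + \left(1 - \frac{n_{j1}}{k}\right)\frac{n_{j2}}{k} + \left(1 - \frac{n_{j1}}{k}\right)\left(1 - \frac{n_{j2}}{k}\right)\frac{n_{j3}}{k} + \cdots \tag{9}$$

From probabilistic considerations, the right-hand side of (9) sums to 1, since it
is just the probability of obtaining one defective item in the jth cycle, which is
1. Consequently,

$$\varphi(n_{j1}, n_{j2}, \ldots) = E(v_j \mid n_{j1}, n_{j2}, \ldots) = k - 1. \tag{10}$$

The proof of the theorem follows from the identity

$$E(v_j \mid v_1, \ldots, v_{j-1}) = E\left[E\{v_j \mid v_1, \ldots, v_{j-1}, n_{j1}, n_{j2}, \ldots\} \mid v_1, \ldots, v_{j-1}\right]. \tag{11}$$

But the probability distribution of $v_j \mid v_1, \ldots, v_{j-1}, n_{j1}, n_{j2}, \ldots$ is a function
only of $n_{j1}, n_{j2}, \ldots$. Hence

$$E\{v_j \mid v_1, v_2, \ldots, v_{j-1}, n_{j1}, n_{j2}, \ldots\} = E\{v_j \mid n_{j1}, n_{j2}, \ldots\} = k - 1 \tag{12}$$

so that $E(v_j \mid v_1, \ldots, v_{j-1}) = k - 1$, and the theorem is proved.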
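
The claim that the conditional expectation equals k - 1 whatever the defect counts are can be checked by summing the defining series directly. The sketch below assumes the series has the form (n1/k)(n1 - 1) + (1 - n1/k)(n2/k)(n1 + n2 - 1) + ... with every count between 1 and k; the function name and the sample sequences are hypothetical.

```python
from itertools import cycle, islice


def expected_passed_defects(k, defect_counts):
    """Sum the series for E(v_j | n_j1, n_j2, ...) over a finite prefix.

    defect_counts holds the number of defectives in successive groups of
    k items; each count is assumed to lie between 1 and k.
    """
    total = 0.0
    survive = 1.0      # probability that no defective has been found so far
    cumulative = 0     # defectives produced up to and including this group
    for n in defect_counts:
        cumulative += n
        total += survive * (n / k) * (cumulative - 1)
        survive *= 1.0 - n / k
    return total


if __name__ == "__main__":
    counts = list(islice(cycle([1, 3, 2, 4]), 400))
    print(expected_passed_defects(5, counts))
```

With 400 groups the neglected tail is far below floating-point precision, so the printed value should agree with k - 1 = 4 to many decimals.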
THEOREM 2.

$$\sum_{j=1}^{\infty} \frac{E(v_j^2 \mid v_1, \ldots, v_{j-1})}{j^2} < \infty.$$
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Proor. Letting r = 2 in equation (5), it follows that E(v} | nj, mj, ++) < 
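$2k^5 - k^4$. Once again, using the identity

$$E(v_j^2 \mid v_1, \ldots, v_{j-1}) = E\left[E\{v_j^2 \mid v_1, \ldots, v_{j-1}, n_{j1}, n_{j2}, \ldots\} \mid v_1, \ldots, v_{j-1}\right]$$

and the fact that the probability distribution of $v_j^2 \mid v_1, \ldots, v_{j-1}, n_{j1}, n_{j2}, \ldots$
is a function only of $n_{j1}, n_{j2}, \ldots$, it follows that

$$E(v_j^2 \mid v_1, \ldots, v_{j-1}) < 2k^5 - k^4. \tag{13}$$

Consequently

$$\sum_{j=1}^{\infty} \frac{E(v_j^2 \mid v_1, \ldots, v_{j-1})}{j^2} < \infty,$$

and the theorem is proved.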
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ON THE POWER OF A ONE-SIDED TEST OF FIT FOR 
CONTINUOUS PROBABILITY FUNCTIONS' 


By Z. W. BIRNBAUM
University of Washington and Stanford University 


Summary. If $F(x)$ is a continuous distribution function of a random variable
$X$, and $F_n(x)$ the empirical distribution function determined by a sample $X_1$,
$X_2, \ldots, X_n$, then the probability $\Pr\{F(x) \le F_n(x) + \epsilon \text{ for all } x\}$ is known
[1] to be a function $P_n(\epsilon)$, independent of $F(x)$. A closed expression for $P_n(\epsilon)$
and a table of some of its values were presented in [2]. In the present paper
$P_n(\epsilon)$ is used to test a hypothesis $F(x) = H(x)$ against an alternative $F(x) = G(x)$.
The power of this test is studied and sharp upper and lower bounds for
it are obtained for alternatives such that $\sup_{-\infty < x < +\infty}\{H(x) - G(x)\} = \delta$, with
preassigned $\delta$. The results of [2] are assumed known.
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1. Description of the test; integral formula for its power. We consider the 
class (F) of cumulative probability functions F(x), continuous for all real x, 
and increasing for all x such that 0 < F(x) < 1. (The assumption that the F(z) 
are strictly increasing is made for convenience of argument only. Theorems 1 
and 2 remain valid if one assumes the F(x) nondecreasing.) We will assume 
H(x) ¢ (F), G(x) e (F), and test the hypothesis F(x) = H(x) against the alter- 
native F(x) = G(x) by the following procedure. 

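To have a test of size $\alpha$, for sample size $n$, we use the value $\epsilon_{n,\alpha}$ from Table 1
in [2], obtain an ordered sample $X_1, X_2, \ldots, X_n$ of $X$, determine the empirical
probability function $F_n(x)$, and reject $H(x)$ if and only if the inequality

$$H(x) < F_n(x) + \epsilon_{n,\alpha} \tag{1.1}$$

fails to hold for all real $x$.
The power of this test is the complementary probability to

$$P = \Pr\{H(x) < F_n(x) + \epsilon_{n,\alpha} \text{ for all } x;\ G(x)\}. \tag{1.2}$$

One verifies easily that (1.1) is satisfied for all $x$ if and only if it is satisfied for
all sample points $X_i$, $i = 1, 2, \ldots, n$, that is if and only if

$$H(X_i) < \frac{i - 1}{n} + \epsilon_{n,\alpha} \quad \text{for } i = 1, 2, \ldots, n. \tag{1.3}$$

We have, therefore,

$$\begin{aligned}
P &= \Pr\left\{H(X_i) < \frac{i - 1}{n} + \epsilon_{n,\alpha} \text{ for } i = 1, 2, \ldots, n;\ G(x)\right\} \\
&= \Pr\left\{X_i < H^{-1}\left(\frac{i - 1}{n} + \epsilon_{n,\alpha}\right) \text{ for } i = 1, 2, \ldots, n;\ G(x)\right\} \\
&= \Pr\left\{G(X_i) < G\left[H^{-1}\left(\frac{i - 1}{n} + \epsilon_{n,\alpha}\right)\right] \text{ for } i = 1, 2, \ldots, n;\ G(x)\right\}.
\end{aligned}$$

Using the notation

$$L(V) = \begin{cases} G[H^{(-1)}(V)] & \text{for } 0 < V < 1 \\ \lim_{V' \to 0+} L(V') & \text{for } V \le 0 \\ \lim_{V' \to 1-} L(V') & \text{for } V \ge 1 \end{cases} \tag{1.4}$$

(The meaning and some applications of the expression $G[H^{(-1)}(V)]$ are discussed
in [3], Sections 3 and 4) and keeping in mind that $U = G(X)$ has the rectangular
distribution $R(u)$ in (0, 1), we conclude

$$P = \Pr\left\{U_i < L\left(\frac{i - 1}{n} + \epsilon_{n,\alpha}\right) \text{ for } i = 1, 2, \ldots, n;\ R(u)\right\}$$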
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where (7, , Us, --- , U, is an ordered sample of U. The joint probability density 
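of $U_1, U_2, \ldots, U_n$ being equal to $n!$ for $0 \le U_1 \le U_2 \le \cdots \le U_n \le 1$, and
zero elsewhere, this can be written

$$P = n! \int_{0}^{L(\epsilon)} \int_{U_1}^{L((1/n) + \epsilon)} \cdots \int_{U_{i-1}}^{L((i-1)/n + \epsilon)} \cdots \int_{U_{n-1}}^{L((n-1)/n + \epsilon)} dU_n \cdots dU_2\,dU_1 \tag{1.5}$$

where $\epsilon$ is written in short for $\epsilon_{n,\alpha}$.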

If $H(x) = G(x)$, then $L(V) = V$ for $0 < V < 1$ and (1.5) reduces to formula
(3.3) in [2]. If $H(x)$ and $G(x)$ are given, it may be possible to evaluate $P$, and
hence the power $1 - P$, from (1.5) by quadrature, or one may compute it by
numerical integration. In the general case, however, it is possible to derive from
(1.5) inequalities for the power, as will be shown in Sections 2 and 3.


2. Lower bound for the power. For given hypothesis $H(x)$, we consider alternatives
$G(x)$ such that

$$\operatorname*{l.u.b.}_{x} \{H(x) - G(x)\} = \delta \tag{2.1}$$

and

$$H(X_0) - G(X_0) = \delta. \tag{2.2}$$

For intuitive reasons one may expect that under these restrictions the power
of our test will be close to its minimum when $G(x)$ is close to the function

$$G^*(x) = \begin{cases} H(X_0) - \delta & \text{for } x \le X_0 \\ 1 & \text{for } X_0 < x. \end{cases}$$

To verify this conjecture we consider

$$L^*(V) = G^*[H^{-1}(V)] = \begin{cases} H(X_0) - \delta & \text{for } 0 < V \le H(X_0) \\ 1 & \text{for } H(X_0) < V < 1, \end{cases}$$

and write

$$H(X_0) = V_0, \qquad H(X_0) - \delta = U_0$$

so that, by (2.2), we have

$$L(V_0) = G[H^{-1}(V_0)] = G(X_0) = U_0, \tag{2.3}$$

and

$$L^*(V) = \begin{cases} U_0 & \text{for } 0 < V \le V_0 \\ 1 & \text{for } V_0 < V < 1. \end{cases}$$

Let $j$ be the greatest integer contained in $n(U_0 + \delta - \epsilon)$,

$$j = [n(U_0 + \delta - \epsilon)] = [n(V_0 - \epsilon)]. \tag{2.4}$$
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We have 


$$L\left(\frac{i - 1}{n} + \epsilon\right) \le L(V_0) = U_0 \quad \text{for } i - 1 \le j. \tag{2.5}$$

This shows that replacing in (1.5) the function $L$ by $L^*$ in the upper limits of
integration will not decrease these limits, and hence

$$P \le n! \int_{0}^{U_0} \int_{U_1}^{U_0} \cdots \int_{U_j}^{U_0} \int_{U_{j+1}}^{1} \cdots \int_{U_{n-1}}^{1} dU_n \cdots dU_{j+2}\,dU_{j+1} \cdots dU_1. \tag{2.6}$$

In case $j < 0$, all upper limits of integration in (2.6) are 1 and we have the
trivial inequality $P \le 1$.
An easy induction shows that the integral in (2.6) is equal to

$$1 - \sum_{i=0}^{j} \binom{n}{i} U_0^i (1 - U_0)^{n-i} = I_{U_0}(j + 1, n - j)$$

where $I_{U_0}$ is the incomplete Beta function, so that (2.6) becomes

$$P \le 1 - \sum_{i=0}^{j} \binom{n}{i} U_0^i (1 - U_0)^{n-i} = I_{U_0}(j + 1, n - j). \tag{2.7}$$

Summarizing we obtain
THEOREM 1. If $H(x)$ and $G(x)$ are continuous distribution functions which satisfy
(2.1) and (2.2), then the power of the test described in Section 1 has the lower bound

$$\sum_{i=0}^{j} \binom{n}{i} U_0^i (1 - U_0)^{n-i} = 1 - I_{U_0}(j + 1, n - j) \tag{2.8}$$

where

$$U_0 = H(X_0) - \delta, \qquad j = [n(H(X_0) - \epsilon)]. \tag{2.81}$$

This lower bound, as a function of $H(X_0)$, $\delta$, $\epsilon$, cannot be improved since,
for any given $H(x)$ in (F), $X_0$, $\epsilon$, one can construct a $G(x)$ arbitrarily close to
$G^*(x)$.
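
The lower bound of Theorem 1 is a binomial partial sum and is simple to evaluate. The following sketch assumes the reading U_0 = H(X_0) - δ and j = [n(H(X_0) - ε)] of the garbled formulas; it uses exact rational arithmetic so that the greatest-integer step is not disturbed by floating-point rounding. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb, floor


def power_lower_bound(n, h_x0, delta, eps):
    """Binomial partial sum sum_{i<=j} C(n,i) U0^i (1-U0)^(n-i).

    Arguments may be Fractions, ints, or decimal strings such as "0.7";
    plain floats are accepted but are converted to their exact binary value.
    """
    h_x0, delta, eps = Fraction(h_x0), Fraction(delta), Fraction(eps)
    u0 = h_x0 - delta
    j = floor(n * (h_x0 - eps))
    if j < 0:
        return Fraction(0)  # only the trivial bound is available
    j = min(j, n)
    return sum(comb(n, i) * u0**i * (1 - u0) ** (n - i) for i in range(j + 1))


if __name__ == "__main__":
    bound = power_lower_bound(10, "0.7", "0.3", "0.2")
    print(float(bound))
```

In the example U_0 = 0.4 and j = 5, so the bound is the probability that a binomial variable with n = 10 and success probability 0.4 does not exceed 5.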


3. Upper bound for the power. It seems plausible that, for given $H(x)$ and
under the restrictions (2.1), (2.2), the power of our test will be close to its maximum
when $G(x)$ is close to the function $G^{**}(x) = \max[H(x) - \delta, 0]$. We consider
$L^{**}(V) = G^{**}[H^{-1}(V)] = \max(V - \delta, 0)$ and observe that when (2.1),
(2.2) are satisfied we have

$$L(V) \ge L^{**}(V) \quad \text{for } 0 < V < 1$$
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so that the integral in (1.5) will not be increased if L is replaced by L** in the 
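upper limits of integration. Denoting by $r - 1$ the greatest integer contained in
$n(1 - \epsilon + \delta)$,

$$r - 1 = [n(1 - \epsilon + \delta)]$$

we obtain, therefore, the inequality

$$P \ge n! \int_{0}^{\epsilon - \delta} \int_{U_1}^{(1/n) + \epsilon - \delta} \cdots \int_{U_{r-1}}^{(r-1)/n + \epsilon - \delta} \int_{U_r}^{1} \cdots \int_{U_{n-1}}^{1} dU_n \cdots dU_{r+1}\,dU_r \cdots dU_2\,dU_1 \quad \text{for } \epsilon \ge \delta \tag{3.1}$$

and

$$P \ge 0 \quad \text{for } \epsilon < \delta. \tag{3.2}$$

To evaluate (3.1) we observe that the expression on the right side is exactly
that of (3.3) and (3.4) in [2], except that $\epsilon$ is replaced by $\epsilon - \delta$. This, together
with (3.0) in [2], leads to the following theorem.

THEOREM 2. If $H(x)$ and $G(x)$ satisfy (2.1) and (2.2), then the power of the test
described in Section 1 has the upper bound

$$1 - P_n(\epsilon - \delta) = (\epsilon - \delta) \sum_{i=0}^{[n(1 - \epsilon + \delta)]} \binom{n}{i} \left(1 - \epsilon + \delta - \frac{i}{n}\right)^{n-i} \left(\epsilon - \delta + \frac{i}{n}\right)^{i-1} \quad \text{for } \epsilon \ge \delta,$$

and the upper bound 1 for $\epsilon < \delta$.
These upper bounds cannot be improved since, for any given $H(x)$ in (F),
$\delta$, $\epsilon$, it is possible to construct a $G(x)$ in (F) arbitrarily close to $G^{**}(x)$.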
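
The upper bound of Theorem 2 can be evaluated in the same spirit. The sketch below assumes the closed expression 1 - P_n(ε) = ε Σ C(n,i)(1 - ε - i/n)^(n-i) (ε + i/n)^(i-1), summed over i from 0 to [n(1 - ε)], with ε replaced by ε - δ; both function names are hypothetical, and the case ε = δ is treated as the limiting value 1.

```python
from fractions import Fraction
from math import comb, floor


def one_sided_tail(n, eps):
    """Evaluate 1 - P_n(eps) from the closed expression, for eps > 0."""
    eps = Fraction(eps)
    top = floor(n * (1 - eps))
    return eps * sum(
        comb(n, i)
        * (1 - eps - Fraction(i, n)) ** (n - i)
        * (eps + Fraction(i, n)) ** (i - 1)
        for i in range(top + 1)
    )


def power_upper_bound(n, delta, eps):
    """Upper bound for the power: 1 - P_n(eps - delta), or 1 if eps <= delta."""
    gap = Fraction(eps) - Fraction(delta)
    if gap <= 0:
        return Fraction(1)
    return one_sided_tail(n, gap)


if __name__ == "__main__":
    print(power_upper_bound(2, Fraction(1, 4), Fraction(1, 2)))
```

For n = 2 and ε - δ = 1/4 the value can be confirmed by hand: P_2(1/4) is the probability that the smaller of two uniform observations is at most 1/4 and the larger at most 3/4, which is 5/16.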


4. The case of n large. The lower bound (2.8) for the power may be approximated,
for $n$ large, by the normal probability integral and, in view of

$$j = [n(U_0 + \delta - \epsilon)] = nU_0 + n(\delta - \epsilon) - \eta, \qquad 0 \le \eta < 1,$$

is approximately equal to
Fa ind 6-0) —n- dann} 
1 Vtod—U) | . 
= cn de 
“aT J—x 


; §(8—-e)— nab 
1 Venta er _ 

> — é ds. 

V2 Lo 


It was shown by Smirnov [4] that $\epsilon_{n,\alpha}$ is asymptotically equal to $\sqrt{(1/2n)\log(1/\alpha)}$.
Substituting this in (4.1), we obtain for the lower bound of the power the
asymptotic expression


1 se 
aaa alhans Cie l/a)—n-i 
(42) 1 Vrodat,) (Mev (/2)log/a)—n-4) ai 
= == e€ ds. 
V/ 2x eae 
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If only 6 is known, but not U> , then (4.2) may be replaced by its minimum with 
regard to Up 


l ' ‘ipuaieateininal 


= et) ds. 
V 25 — 30 
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Abstracts of papers presented at the Washington meeting of the Institute 
April 29-May 1, 1953 


1. Optimum Sample Sizes for Choosing the Largest of (k + 1) Means Using
Minimax Methods. PAUL N. SOMERVILLE, University of North Carolina.
Assume we have $(k + 1)$ normally distributed populations with unknown means $a_0 \ge
a_1 \ge \cdots \ge a_k$. It is decided to choose $N$ individuals from these populations in such a way
that the expected value of their total is as large as possible. A preliminary sample of $n$
is taken from each population with the object of deciding from which population the further
sample of size $N$ should be taken. $N(a_i - a_0)$ is then the loss involved in choosing the
population with parameter $a_i$. Assume the cost of the sample is a linear function of the
sample size. Using results previously given it is shown that the minimax $n$ is proportional
to $N^{2/3}$. Explicit results are given for k = 1, 2, 3, 4, 5, for a one-stage preliminary sample.
For the case k = 2, results for a two-stage procedure are given. In the first stage, samples of
$n_1$ are taken for each of the three populations. In the second stage, samples of $n_2$ are taken
from each of the two populations with the largest means in the first stage. If $3n_1 + 2n_2 =
3n$, then it is found that the maximum expected loss is less for the two-stage sample than
for the one-stage sample provided $n_1/n_2$ is greater than .37 (approximately). The optimum
ratio in this sense is found to be $n_1/n_2 = 1.2$ (approximately). If for $n_1/n_2 = 1.2$, the maximum
expected losses are equated by a reduction in the total preliminary sample size, a
saving of 6.6 per cent over the one-stage procedure in the preliminary sample size is effected.


2. The Correspondence Between Two Classes of Balanced Incomplete Block 
Designs. W. S. Connor, National Bureau of Standards. 


Let $\Sigma_1(n)$ denote the problem of constructing the design with parameters $v = \tfrac12 n(n + 1)$,
$b = \tfrac12 (n + 1)(n + 2)$, $k = n$, $r = n + 2$, and $\lambda = 2$; and let $\Sigma_2(n)$ denote the problem of
constructing the design with parameters $v = b = \tfrac12 (n + 1)(n + 2) + 1$, $r = k = n + 2$,
and $\lambda = 2$ $(n > 1)$. It is shown that $\Sigma_1(n)$ has a solution only if $\Sigma_2(n)$ has a solution.
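
A quick arithmetic check may help when reading the two parameter sets. The sketch below assumes they read v = n(n+1)/2, b = (n+1)(n+2)/2, k = n, r = n+2, λ = 2 for the first family and v = b = (n+1)(n+2)/2 + 1, r = k = n+2, λ = 2 for the second, and verifies only the two standard counting identities bk = vr and λ(v − 1) = r(k − 1). The function names are illustrative and are not taken from the abstract; passing these identities is necessary for a design to exist, not sufficient.

```python
def bibd_identities_hold(v, b, r, k, lam):
    """Check the two counting identities every balanced incomplete block design satisfies."""
    return b * k == v * r and lam * (v - 1) == r * (k - 1)


def first_family(n):
    # Assumed reading of the first parameter set: (v, b, r, k, lambda).
    return (n * (n + 1) // 2, (n + 1) * (n + 2) // 2, n + 2, n, 2)


def second_family(n):
    # Assumed reading of the second (symmetric) parameter set: (v, b, r, k, lambda).
    v = (n + 1) * (n + 2) // 2 + 1
    return (v, v, n + 2, n + 2, 2)


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, first_family(n), bibd_identities_hold(*first_family(n)),
              second_family(n), bibd_identities_hold(*second_family(n)))
```

Under this reading the first design has exactly the parameters of a residual of the second: removing one block of the symmetric design, together with its n + 2 points, leaves v − (n + 2) points and b − 1 blocks.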


3. A Finite Frequency Theory of Probability. A. H. Copeland, Sr., University
of Michigan.


This paper develops a new theory of probability, the finite frequency theory, in which
probabilities are regarded as physical hypotheses. Associated with each probability is a
system of predictions which can be tested by experiment. An experiment may either confirm
or disagree with a given prediction. This theory of probability produces some complications
in formal logic. However, the theory and its associated deductive and inductive
logics are in better agreement with modern scientific reasoning than the conventional
probability theories and the conventional logics.


4. Characterizations of Complete Classes of Tests of Some Multiparametric
Hypotheses, with Applications to Likelihood Ratio Tests. Allan Birnbaum,
Columbia University.


Let $H_0$ be a simple hypothesis on a density function of the form

$$p_\theta(e) = \exp\Big\{\theta_0 + \sum_{i=1}^{k} \theta_i t_i(e) + t_0(e)\Big\}.$$

Let $T$, the range of the sufficient statistic $t = (t_1, \cdots, t_k)$, be independent of $\theta$. Let $V'$
be the class of nonrandomized decision functions $\delta(t)$ such that each $\delta(t) = 0$ just on the
intersection of some open convex set with $T$. Let $V$ be the class of randomized decision
functions $\delta(t)$ each of which coincides with a member of $V'$ except on a set of measure zero.
Then under certain assumptions it is shown (a) that $V'$ is essentially complete, and (b)
that $V$ is complete. Under further assumptions, chiefly requirements that the alternative
hypothesis be sufficiently general, it is shown (a) that $V'$ is minimal essentially complete,
and (b) that $V$ is minimal complete. Applications are made to likelihood ratio tests of $H_0$,
which are shown to be included in $V'$, to discrete distributions of the form $p_\theta(e)$, and to tests
of composite hypotheses on $p_\theta(e)$.


5. Confidence Regions for the Location of the Vertex in Quadratic Regression.
(Preliminary Report.) David L. Wallace, Princeton University.


Procedures are considered for obtaining confidence regions for the location of the vertex
of a regression surface which is a quadratic function of k "determining" variables $x_1, \cdots, x_k$
from a sample with normal homoscedastic error on the dependent variable only. The
hypothesis that $(x_1^0, \cdots, x_k^0)$ is the vertex of the regression surface is a general linear
hypothesis: a set of k linear homogeneous equations in the regression coefficients in which
the coefficients in the equations are linear functions of the $\{x_i^0\}$. For any general linear
hypothesis of this form, a confidence region for $(x_1^0, \cdots, x_k^0)$ is obtained by the standard (F)
test. This region possesses several "optimum" properties, but is unsatisfactory for practical
applications. If each of the k single linear hypotheses making up the general linear
hypothesis is tested separately by the standard (t) test, k different confidence regions,
whose shapes are usually hyperboloids, are obtained for the $(x_1^0, \cdots, x_k^0)$. The intersection
of these is a confidence region for $(x_1^0, \cdots, x_k^0)$ with bounded risk. Approximations to this
intersection region by parallelepipeds and polyhedra are discussed. Requirements for usable
confidence regions are discussed and proposed procedures are rated primarily by these
requirements.


6. The Noncentral Wishart Distribution. (Preliminary Report.) A. T. James,
Princeton University.


The noncentral Wishart distribution, as T. W. Anderson showed, is the central
distribution multiplied by a symmetric function, $\psi$, of the latent roots $\alpha_i$ of the matrix

$$\Sigma^{-1} T \Sigma^{-1} A$$

where $\Sigma$ is the $k \times k$ variance covariance matrix of the parent normal k-variate distributions,
$T$ is the $k \times k$ matrix of sums of squares and products of population means about
their averages and $A$ is the sample variance covariance matrix. It is shown that $\psi$ is the
average of an exponential function in several variables over the orthogonal group. The
exponential function is an eigenfunction of the Laplace operator $\Delta$, and $\Delta$ commutes with
the operation of averaging over the group. Hence $\Delta\psi = \psi$. If $\Delta$ is expressed in terms of the
latent roots $\alpha_i$, a system of second order partial differential equations for $\psi$ is obtained,
which can be solved in power series for k < 3. For k > 3, the partial differential equations
yield an effective system of recurrence relations for the coefficients of the multiple power
series.


7. On Time-Dependent Waiting Line Processes. A. Bruce Clarke, University
of Michigan.


A single-server waiting line process with Poisson distributions on the input and service
times is considered. The parameters $\lambda$ and $\mu$ of these Poisson distributions are assumed
to be arbitrary nonnegative functions of time. An exact formula for the transition
probabilities, $P_{m,n}(t)$, for the line to have length n at time t, given that it had length m at time 0,
is found. The formula involves a function which is defined as the solution of a certain
Volterra type integral equation; this can be determined explicitly for the special case in which
the ratio of $\lambda$ and $\mu$ is independent of time, and numerically otherwise. The general method
of solution is to use the Kolmogorov equations to obtain a hyperbolic partial differential
equation for a modified characteristic function of the distribution, thus reducing the
problem to a boundary value problem that can be solved by standard methods. The formula
for $P_{m,n}(t)$ is used to discuss various properties of the distribution, with special attention
to nonstationary processes.
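
For readers who want numbers rather than the exact formula, the transition probabilities of such a queue can be approximated by integrating the Kolmogorov forward equations directly. The sketch below is not the method of the abstract: it truncates the line at an assumed maximum length and takes plain Euler steps, and the names transition_probs, max_len and steps are illustrative choices.

```python
def transition_probs(m, t_end, lam, mu, max_len=60, steps=20000):
    """Approximate P(length n at time t_end | length m at time 0) for n = 0..max_len.

    lam(t) and mu(t) are the arrival and service rates at time t. The state space
    is truncated at max_len (an assumption of this sketch), and the forward
    equations are integrated with explicit Euler steps of size t_end / steps.
    """
    dt = t_end / steps
    p = [0.0] * (max_len + 1)
    p[m] = 1.0
    t = 0.0
    for _ in range(steps):
        a, s = lam(t), mu(t)
        new = p[:]
        for n in range(max_len + 1):
            # Rate of leaving state n: an arrival (unless at the cap) or a departure (unless empty).
            out_rate = (a if n < max_len else 0.0) + (s if n > 0 else 0.0)
            # Rate of entering state n: arrival from n - 1, departure from n + 1.
            inflow = (a * p[n - 1] if n > 0 else 0.0) + (s * p[n + 1] if n < max_len else 0.0)
            new[n] = p[n] + dt * (inflow - out_rate * p[n])
        p = new
        t += dt
    return p


if __name__ == "__main__":
    probs = transition_probs(2, 1.0, lambda t: 1.0 + 0.5 * t, lambda t: 2.0,
                             max_len=40, steps=2000)
    print(sum(probs), probs[:5])
```

The step size must be small relative to the largest total rate for the Euler update to keep the probabilities nonnegative, and the truncation level should be well above any length the line is likely to reach on the interval of interest.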


8. Some Estimates Which Minimize the Least Upper Bound of a Probability
Together with the Cost of Observation. H. S. Konijn, University of California,
Berkeley.


When an investigator aims primarily at insuring a high chance of getting a point estimate
$T_N$ of an unknown parameter point $\theta$ within a reasonable distance $a$ of $\theta$, the loss function
proportional to the distance $d(T_N, \theta)$, which is generally used implicitly or explicitly, is
inappropriate, and should be replaced by $W = 1$ if $d > a$ (or $> a_i$ in direction $i$, $i = 1, 2, \cdots$)
and $W = 0$ otherwise. Somewhat similar considerations already arose in the theory of
confidence intervals. Following in part a procedure of Wolfowitz (Ann. Math. Stat., Vol. 21
(1950), pp. 218-230), the present paper bases the choice of estimating interval $[R_N, S_N]$
on the probability of covering $\theta$, $\Pr\{d(R_N, \theta) > a'\}$, and $\Pr\{d(S_N, \theta) > a''\}$. When the cost
of observation plays a role in the selection of estimates, it usually enters in the form of its
mathematical expectation, but other ways may be considered. The paper investigates in
detail the (highly manageable) case of a normal variate with known variance. In several
instances it obtains explicit results, which allow suggestive comparisons with classical
methods as to sample size, unbiasedness, "shortness," etc. A still different point of view
is briefly formulated.


9. On a Multivariate Analogue of Student's t-Distribution, with Some Tables
for the Bivariate Case. Charles W. Dunnett and Milton Sobel, Cornell
University.


We consider the joint distribution of p variates $t_i = z_i/s$, $i = 1, 2, \cdots, p$. The $z_i$ have a
joint multivariate normal distribution with means 0, variances $\sigma^2$ and correlation matrix
$(\rho_{ij})$; $ns^2/\sigma^2$ has a chi square distribution, independent of the $z_i$, with n degrees of freedom.
The joint density function of the $t_i$ is given by

$$|A|^{1/2}\,\Gamma\Big(\frac{n + p}{2}\Big)\Big\{(n\pi)^{p/2}\,\Gamma\Big(\frac{n}{2}\Big)\Big(1 + \sum_{i,j} a_{ij} t_i t_j / n\Big)^{(n+p)/2}\Big\}^{-1},$$

where $|A|$ is the determinant of the positive definite matrix
$(a_{ij}) = (\rho_{ij})^{-1}$. This reduces to the Student t-distribution when p = 1. For the bivariate
case (p = 2), the following results were obtained: (a) an exact expression in the form of a
finite series for the probability integral from $(h, k)$ to $(\infty, \infty)$, (b) an asymptotic series in
powers of $n^{-1}$ for this probability integral, (c) an asymptotic series in powers of $n^{-1}$ for the
value of h = k for which the probability integral is equal to an arbitrary specified value,
and (d) tables of the probability integral and certain percentage points for the special
cases h = k and $\rho = \pm\tfrac12$, where $\rho$ is the correlation between $z_1$ and $z_2$. These tables are
required for certain multiple decision ranking problems involving three population means
(Ann. Math. Stat., Vol. 24 (1953), p. 136). (Research sponsored by Air Research and
Development Command.)
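
As an illustration of the density in the bivariate case, the sketch below evaluates the standard multivariate t form |A|^(1/2) Γ((n+p)/2) / {(nπ)^(p/2) Γ(n/2) (1 + Σ a_ij t_i t_j / n)^((n+p)/2)} for p = 2, with A taken as the inverse of the 2 × 2 correlation matrix with off-diagonal entry rho. The function names are illustrative, and the code only evaluates the density; it does not reproduce the probability integrals or tables described in the abstract.

```python
import math


def bivariate_t_density(t1, t2, n, rho):
    """Density of (t1, t2) with n degrees of freedom and correlation rho, case p = 2."""
    p = 2
    det_a = 1.0 / (1.0 - rho ** 2)  # |A| for A = inverse of [[1, rho], [rho, 1]]
    # Quadratic form sum_ij a_ij t_i t_j written out for the 2 x 2 inverse.
    quad = (t1 * t1 - 2.0 * rho * t1 * t2 + t2 * t2) / (1.0 - rho ** 2)
    norm = (math.sqrt(det_a) * math.gamma((n + p) / 2)
            / ((n * math.pi) ** (p / 2) * math.gamma(n / 2)))
    return norm * (1.0 + quad / n) ** (-(n + p) / 2)


def student_t_density(t, n):
    """The p = 1 case, which is the ordinary Student t density."""
    norm = math.gamma((n + 1) / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return norm * (1.0 + t * t / n) ** (-(n + 1) / 2)


if __name__ == "__main__":
    print(bivariate_t_density(0.0, 0.0, 5, 0.5), student_t_density(0.0, 5))
```

At the origin the bivariate density reduces to |A|^(1/2) / (2π) for every n, which gives a simple sanity check on the normalizing constant.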


10. On the Completeness of Classes of Bayes' Solutions. Lucien M. LeCam,
University of California, Berkeley.


The terminology used is that in Wald's book, Statistical Decision Functions, John Wiley
and Sons, 1950. It is shown that assumptions (3.3) and (3.4) of that book can be
replaced by the following weaker assumptions. (1) $F$ is a function of an element $\omega$ in some
arbitrary index set $\Omega$. (2) The space $D^t$ of terminal decisions is a compact metrisable
Hausdorff space. (3) The weight function $W$ depends on $\omega$, $d^t$ and possibly on the indices of the
random variables actually observed. Moreover,
$\inf_{d^t \in D^t;\, s_1, \cdots, s_k} W(\omega, d^t; s_1, \cdots, s_k) > -\infty$.
(4) For each $\omega \in \Omega$ and each set $\{s_1, \cdots, s_k\}$, the function $W$ is lower semicontinuous
on $D^t$. The assumption of separability (3.2) loses part of its meaning and can be dropped.
Assumptions (3.1), (3.5) and (3.6) can also be weakened but not very significantly. Under
these weakened assumptions, the class of admissible decision functions is complete and
theorems (3.5), (3.7), (3.8), (3.9), (3.17) and (3.18) remain true. In theorems (3.17) and (3.18),
the class $D_b$ of decision procedures with bounded risk functions can be replaced by the class
$D$ of all decision procedures.


11. Identification and Estimation of Linear Structures with Symmetric Errors. 
T. A. Jeeves, University of California, Berkeley. 


Consider a vector random variable $X$ with n components having the following structure:
(i) $X = \xi + U$; (ii) $\xi$ is a vector random variable such that $B\xi = A$ where $B$ is an $s \times n$
matrix of constants and $A$ is a constant vector of s components; (iii) $\xi$ and $U$ are independent
and (iv) the distribution of $U$ is symmetric about some known point in n-dimensional space.
A necessary and sufficient condition for the identifiability of $A$ and the column space of $B$
is that the distribution of $\xi$ should not be symmetric about a point. An estimate based on the
sample characteristic function is given which converges almost surely when the parameters
are identifiable.


12. The Cramér-Smirnov Test in the Parametric Case. (Preliminary Report.)
Donald A. Darling, Columbia University.


Given a set of n data (independent, identically distributed random variables)
$X_1, X_2, \cdots, X_n$ we wish to test the hypothesis $H$ that their common continuous cdf is
$F(x; \theta)$ for some (unknown) value of the (real) parameter $\theta \in \Omega$. In modifying the usual
chi square test where an auxiliary parameter is to be estimated we consider, following a
suggestion of Cramér, the test function

$$W_n^2 = n \int_{-\infty}^{\infty} \big(F_n(x) - F(x; \hat\theta_n)\big)^2 \, dF(x; \hat\theta_n),$$

where $F_n(x)$ is the empirical cdf of the data and $\hat\theta_n$ is some estimate of $\theta$. Two essentially distinct
cases arise. (a) If $\hat\theta_n$ is a superefficient estimator of $\theta$, $W_n^2$ has the same limiting distribution as
in the nonparametric case, the Smirnov distribution. (b) If $F(x; \theta)$ satisfies Cramér's conditions
for regular estimation and an asymptotically efficient unbiased estimator $\hat\theta_n$ (the
maximum likelihood estimator essentially) exists we have the following result: let
$f = \partial F/\partial x$, $\sigma^2 = \lim_{n \to \infty} n \operatorname{Var}(\hat\theta_n) = [E\{(\partial \log f/\partial\theta)^2\}]^{-1}$, and put $u = F(x; \theta)$,
$h'(u) = \sigma\, \partial \log f/\partial\theta$, $0 \le u \le 1$, and let

$$a_k = \sqrt{2} \int_0^1 h(u) \sin \pi k u \, du, \quad k = 1, 2, \cdots, \qquad G(\lambda) = 1 + \sum_{k} \frac{\lambda a_k^2}{1 - \lambda/\pi^2 k^2}.$$

Then the limiting characteristic function of $W_n^2$ is $\big(\sqrt{2it}\,\csc\sqrt{2it}\big)^{1/2} (G(2it))^{-1/2}$. The method of
proof is by a consideration of a Gaussian process following an idea of Doob. Unlike the
corresponding nonparametric case the test is not distribution free, and in general the limiting
distribution of $W_n^2$ will even depend on the true value $\theta$. In important special cases,
including that where $\theta$ is a scale or location parameter, the function $h(u)$, and consequently
the distribution of $W_n^2$, does not depend on $\theta$, however. The theory can be extended to the
case of several unknown parameters, and it is possible to discuss the corresponding
Kolmogoroff test function using these methods. (Research sponsored by Air Research and
Development Command of the Air Force.)


13. Asymptotic Solutions of the Compound Problems for Two Completely Specified
Populations. James F. Hannan and Herbert Robbins, Catholic
University of America and University of North Carolina.


Let $v$ be a vector of arbitrary dimensionality and let $F(v, 0)$ and $F(v, 1)$ be any two distinct
distribution functions. Let $X_1, \cdots, X_n$ be independent random vectors such that $X_i$
has the distribution function $F(v, \theta_i)$. Let $X = (X_1, \cdots, X_n)$ and let $\theta = (\theta_1, \cdots, \theta_n)$.
It is required to decide for each $i$, on the basis of $X$ and the known distribution functions
$F(v, 0)$ and $F(v, 1)$, whether $\theta_i$ is 0 or 1. The loss of the compound decision $d = (d_1, \cdots, d_n)$
is taken to be $W(d, \theta) = n^{-1} \sum_{i=1}^{n} (a\theta_i(1 - d_i) + b(1 - \theta_i)d_i)$, $a$ and $b$ being positive constants
determined by the empirical background of the problem. This problem was previously considered
for the special case of $N(-1, 1)$ and $N(1, 1)$ (Herbert Robbins, "Asymptotically
subminimax solutions of compound statistical decision problems," Proceedings of the Second
Berkeley Symposium on Mathematical Statistics and Probability, University of California
Press, 1951, pp. 131-148). The present paper constitutes a generalization and amplification
of results obtained there.
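
The compound loss is easy to compute once decisions and true states are coded as 0 or 1. The sketch below evaluates W(d, θ) = n^(-1) Σ (a θ_i (1 − d_i) + b (1 − θ_i) d_i); the function name compound_loss is an illustrative choice, and the code covers only the loss, not the decision procedures studied in the paper.

```python
def compound_loss(d, theta, a, b):
    """Average loss of decisions d against true states theta (both sequences of 0/1).

    A missed 1 (theta_i = 1, d_i = 0) costs a; a false 1 (theta_i = 0, d_i = 1) costs b;
    correct decisions cost nothing.
    """
    if len(d) != len(theta):
        raise ValueError("d and theta must have the same length")
    n = len(theta)
    total = sum(a * t * (1 - di) + b * (1 - t) * di for di, t in zip(d, theta))
    return total / n


if __name__ == "__main__":
    print(compound_loss([1, 0, 1, 0], [1, 1, 0, 0], a=2.0, b=3.0))
```

In the usage line, the second component contributes a = 2 and the third contributes b = 3, so the average over four components is 1.25.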


14. On the Estimation of the Mean Life of a Radioactive Source. (Preliminary
Report.) Richard F. Link, Princeton University.


Procedures are given for estimating the mean life of a radioactive source assuming:
i) the individual times at which particles disintegrate are recorded for a time interval
$(T_0, T_1)$, ii) the number of particles which disintegrate in each of K nonoverlapping time
intervals is recorded. Methods for obtaining exact confidence intervals for the estimate of 
the mean life are presented for two of the procedures. Asymptotic variances are derived for 
all of the estimates. Comparisons of the asymptotic efficiency of the various methods are 
given. For two of the methods comparisons of the expected lengths of confidence intervals 
for the mean life, given that n disintegrations are observed, are presented for n = 10, 25. 


15. The Use of the Questionnaire to Compare Two Populations for the Pur- 
pose of Improving the Course Content in a Mathematics Course for Busi- 
ness Teachers. Mary Goins, Marshall College. 


A study was made to determine the relative amount and kind of mathematics required in 
business teacher training programs in professional schools of business as compared with such 
programs in teachers colleges. Random samples from the two populations were drawn. 
An appropriate questionnaire was sent to the administrators of the institutions. Results 
were tabulated and statistical computations made. On the average, professional collegiate 
schools of business were found to have stronger required courses in mathematics than 
teachers colleges having the same type of curriculum. Changes in course content based on 
the computed statistics are discussed in this paper. 


16. Minimax Decisions Regarding Mean of a Normal Variable with Unknown
Variance. Manindra N. Ghosh, University of North Carolina.


In a recent paper by the author (Sankhyā, Vol. 13) Wald's decision problem has been
generalized to the case of unbounded weight function $W(F, d^t)$ and locally compact space
$D^t$ of decisions. In this paper some applications of this method to the case of decisions
regarding the mean of a normal variable, in the fixed sample or sequential procedure, have been
made when the variance is unknown.


17. On Two-Stage Estimation Procedures. S. G. Ghurye and Herbert Robbins,
University of North Carolina and Institute for Advanced Study,
Princeton.


Let $P_i$, $i = 1, 2$, be two populations and let $\theta_i$ be a parameter connected with $P_i$. Let
$t_i(n)$ be statistics (of finite variance) based on samples of size n from $P_i$ and such that
$E\,t_i(n) = \theta_i$. Samples of sizes $n_i$ from $P_i$ yield the unbiased estimate $t_1(n_1) - t_2(n_2)$ of
$\theta_1 - \theta_2$. The total sample size $N = n_1 + n_2$ being prescribed, it is desired to partition $N$
so as to minimize the variance of $t_1(n_1) - t_2(n_2)$. When the variances of the $t_i(n)$ are unknown,
a two-stage sampling procedure is utilized. Some particular investigations of such
problems have been made by others (e.g., J. Putter), but this paper considers the asymptotic
behavior (as $N \to \infty$) under general conditions, and also the situation for finite $N$ in
special cases. (This work was supported by the U. S. Air Force under contract AF 18(600)-83.)
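
To make the partition problem concrete, the sketch below assumes the simplest case, which the abstract does not spell out: each statistic is a sample mean, so that the variance of t_i(n) is σ_i²/n. The variance of the difference is then σ_1²/n_1 + σ_2²/n_2, which is minimized near n_1/n_2 = σ_1/σ_2. The two-stage step shown here, estimating the σ_i from pilot samples and then splitting N, is only an illustration of the idea; the names best_split and two_stage_split are hypothetical.

```python
import statistics


def variance_of_difference(n1, n2, sigma1, sigma2):
    """Variance of t1(n1) - t2(n2) when Var t_i(n) = sigma_i**2 / n (assumed form)."""
    return sigma1 ** 2 / n1 + sigma2 ** 2 / n2


def best_split(total, sigma1, sigma2):
    """Integer n1 in 1..total-1 minimizing the variance, found by exhaustive search."""
    return min(range(1, total),
               key=lambda n1: variance_of_difference(n1, total - n1, sigma1, sigma2))


def two_stage_split(pilot1, pilot2, total):
    """Estimate the standard deviations from pilot samples, then split the total size."""
    s1 = statistics.stdev(pilot1)
    s2 = statistics.stdev(pilot2)
    n1 = best_split(total, s1, s2)
    # Neither final sample may be smaller than the pilot sample already drawn from it.
    n1 = min(max(n1, len(pilot1)), total - len(pilot2))
    return n1, total - n1


if __name__ == "__main__":
    print(best_split(100, 1.0, 3.0))
    print(two_stage_split([0.0, 2.0], [0.0, 6.0], 100))
```

With known σ_1 = 1 and σ_2 = 3 the search returns n_1 = 25, in agreement with the rule n_1 = N σ_1/(σ_1 + σ_2).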


18. Estimation of the Location Parameter in the Structural Problem of Neyman.
T. A. Jeeves, University of California, Berkeley.


Consider a pair of random variables $(X, Y)$ having the following structure: (i) $X = \xi + U$,
$Y = \eta + V$; (ii) $(\xi, \eta)$ are random variables such that $\xi \cos\theta + \eta \sin\theta = p$ for
certain constants $\theta$ and $p$ ($-\pi/2 < \theta \le \pi/2$); (iii) $(U, V)$ are independent of $(\xi, \eta)$. Let
$\{X_m, Y_m\}$ be a sequence of random variables such that each pair has the above structure
and is independent of every other pair. The basic problem is to use this sequence to construct
a pair of statistics $\theta_N$ and $p_N$ which will converge in some sense to $\theta$ and $p$, respectively.
If $U = U_1 + U_2$ and $V = V_1 + V_2$ with $(U_1, V_1)$ jointly normal and independent of $(U_2, V_2)$
and $U_2$ independent of $V_2$, then under the assumption that $\xi$ and $\eta$ are not both normal,
Neyman ("Existence of Consistent Estimates of the Directional Parameter in a Linear
Structural Relation Between Two Variables," Ann. Math. Stat., Vol. 22 (1951), pp. 497-512)
has given a consistent estimate of $\theta^*$ ($\theta^* = \theta$ if $\theta \ne \pi/2$, $\theta^* = 0$ if $\theta = \pi/2$). To date no
estimates of $p$ have been given. In fact, even assuming the means of $(U_1, V_1)$ known, without
further restrictions on $(U_2, V_2)$, $p$ is not identifiable and hence no consistent estimate
exists. Using the sample characteristic function, estimates $(\theta_N, p_N)$ have been obtained
which converge almost surely to $(\theta, p)$ under the assumption that $U_2$ and $V_2$ have symmetric
distributions. In a similar manner, estimates have been obtained for the case in
which the first moments of $U_2$ and $V_2$ exist and are known.


19. On the Distribution of the Sum of the Roots of a Determinantal Equation.
K. C. S. Pillai, University of North Carolina.


In four different situations of testing hypotheses relating to p-variate normal populations
we run into the roots (all nonnegative) of the determinantal equation in
$\theta$: $|S_1 - \theta S_2| = 0$, where $S_1$ ($p \times p$), $S_2$ ($p \times p$) are sample matrices such that almost
everywhere $S_1$ is at least positive semidefinite of rank $q$ ($\le p$) and $S_2 - S_1$ (and hence
necessarily also $S_2$) is positive definite. Under the null hypothesis, the joint distribution of the
q positive roots $\theta_1 \le \theta_2 \le \cdots \le \theta_q$ is well known. Starting from this distribution, the
successive moments of the sum of the q roots, $s_q$, say, have been studied by means of a
recurrence relation. The lower order moments indicate that the distribution of $s_q$ can be
approximated by a Beta function of the form:
const. $s_q^{\,q[m + \frac12(q+1)] - 1} (1 - s_q/q)^{\,q[n + \frac12(q+1)] - 1}$
$(0 \le s_q \le q)$. For small values of q the approximation is satisfactory if $m + n \ge 30$, and for
large values of n this distribution can be further approximated by a Gamma function with
$q[m + \frac12(q + 1)]$ degrees of freedom. This result has been established in two different ways,
namely, using (1) the distribution of $s_q$ given above and (2) the method of characteristic
function on an asymptotic joint distribution of the roots. T. W. Anderson following P. L.
Hsu has obtained the asymptotic Gamma function distribution by another method.


20. On a Problem in Multivariate Regression. Thomas S. Ferguson, University
of California, Berkeley.


Consider s random variables $\xi_1, \cdots, \xi_s$ and n + 1 random variables $\eta_0, \eta_1, \cdots, \eta_n$ such
that the $\eta_j$ are independent of the $\xi_k$ and also independent among themselves, but the $\xi_k$
are not necessarily independent among themselves. We assume that the $\xi_k$ and $\eta_j$ are
nondegenerate and that all have finite first moments which we assume to be zero. Let
$X_j = \sum_{k=1}^{s} a_{jk}\xi_k + \eta_j$ for $j = 0, 1, \cdots, n$ where the $a_{jk}$ are arbitrary constants. In the case n = 1
the following result is obtained. Theorem 1: In order that the regression of $X_0$ on $X_1$ be a linear
function of $X_1$ irrespective of the values of the constants $a_{jk}$, it is necessary and sufficient that
the characteristic functions of $\eta_1$ and $(\xi_1, \cdots, \xi_s)$ be of the form $\exp\{-K|u|^{\nu}\}$ and
$\exp\{-\rho^{\nu} g(v_1/\rho, \cdots, v_s/\rho)\}$ respectively, where $K$ and $\nu$ are constants, $K > 0$, $1 < \nu \le 2$,
$\rho = (v_1^2 + \cdots + v_s^2)^{1/2}$, and $g$ is an arbitrary real function such that $\exp\{-\rho^{\nu} g(v_1/\rho, \cdots, v_s/\rho)\}$
is the joint characteristic function of s nondegenerate random variables with zero means. The
general result for n > 1 is Theorem 2: In order that the regression of $X_0$ on $X_1, X_2, \cdots, X_n$
be linear in $X_1, X_2, \cdots, X_n$ irrespective of the values of the constants $a_{jk}$, it is necessary
and sufficient that each $\eta_j$ be normal and that $\xi_1, \cdots, \xi_s$ have a multivariate normal
distribution. This paper extends the results of E. Fix ("Distributions which lead to linear
regressions," Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability,
University of California Press, 1949).


21. On the Problem of Construction of Orthogonal Arrays. Esther Seiden,
University of Chicago.


The remarks made by O. Kempthorne, (Biometrika, Vol. 34 (1947)) and K. A. Brownlee 
and P. K. Loraine (‘‘The relationship between finite groups and completely orthogonal 
squares, cubes and hyper-cubes,’’ Biometrika, Vol. 35 (1948), pp. 277-282) regarding some of 
the multifactorial designs constructed by Plackett and Burman can be extended to all of 
them. In order to avoid confounding of main effects with first order interactions, only arrays 
of strength at least 3 should be used. It is shown that all the designs of Plackett and Bur- 
man, in which each factor takes on only two levels, form a scheme leading to the construc- 
tion of orthogonal arrays of strength 3 with the maximum possible numbers of constraints. 
An orthogonal array (36, 13, 3, 2) is constructed. It is known that the upper bound for the 
number of constraints is in this case 16. The method of construction used could not lead 
to a number of constraints greater than 13, but it is not known whether one would not do 
better using another one. 


22. The Joint Distribution of n Successive Amplitudes. (Preliminary Report.)
W. C. Hoffman, U. S. Navy Electronics Laboratory, San Diego.


The joint probability density function for two values of the output
$R(t) = \{X^2(t) + Y^2(t)\}^{1/2}$ of a linear detector (Lawson and Uhlenbeck, Threshold Signals, McGraw-Hill
Book Co., 1950, p. 61, equation (72)) is generalized to the case of n such random variables,
assuming a multivariate normal distribution for the input signals. The derivation depends
in an essential manner on the following properties of elements of the inverse covariance
matrix $\Lambda_n^{-1}$: $\lambda^{2j-1,2k-1} = \lambda^{2j,2k}$ $(j, k = 1, 2, \cdots, n)$; $\lambda^{2j-1,2k} = -\lambda^{2j,2k-1}$ $(j \ne k)$;
$\lambda^{2j-1,2j} = \lambda^{2j,2j-1} = 0$. The joint probability density function for the n-dimensional case has the form
$f(r_1, \cdots, r_n) = |\Lambda_n|^{-1/2}\, r_1 \cdots r_n \exp\{-\tfrac12 \sum_{j=1}^{n} \lambda^{2j-1,2j-1} r_j^2\}\, Q(r_1, \cdots, r_n; \Gamma)$, where $\Lambda_n$
is the $2n \times 2n$ covariance matrix of the input, $\Gamma$ is the symmetric matrix $(\gamma_{jk})$ with
$\gamma_{jk} = \{(\lambda^{2j-1,2k-1})^2 + (\lambda^{2j-1,2k})^2\}^{1/2}$ and $Q$ is an infinite series each of whose terms consists of
products of modified Bessel functions of the first kind multiplied by the cosine of a weighted
sum of the parameters $\phi_{jk} = \arctan(\lambda^{2j-1,2k}/\lambda^{2j-1,2k-1})$. The subscripts of the Bessel functions
range over all nonnegative integers but must satisfy certain linear relations.


23. Simultaneous Tests for Regression Coefficients by the Two Stage Procedure.
(Preliminary Report.) Manindra Nath Ghosh, University of North
Carolina.

In setting up a prediction equation of the form $E(w) = \alpha + \beta x + \gamma y + \delta z$, the tests of
significance of the hypotheses $H_1$: $\beta = 0$, $H_2$: $\gamma = 0$, $H_3$: $\delta = 0$, by the usual method are not
independent. Instead of combining these hypotheses and using an F-test, one would prefer
to make simultaneous decisions regarding the hypotheses before setting up the final prediction
formula. The methods developed by Scheffé-Tukey-Bose-Roy of simultaneous confidence
intervals have been employed for this purpose and a two-stage procedure along the
lines of Stein ("A two-sample test for a linear hypothesis whose power is independent of
the variance," Ann. Math. Stat., Vol. 16 (1945), pp. 243-258) has been developed to keep
the probability of a wrong judgment regarding the hypotheses $H_1$: $|\beta| > \beta_0$, $H_2$: $|\gamma| > \gamma_0$,
$H_3$: $|\delta| > \delta_0$, less than $\alpha$ per cent, where $\beta_0$, $\gamma_0$, $\delta_0$ depend upon the relative cost of
measuring the variables, and the variables $x$, $y$, $z$ can be controlled for the purpose of the
experiment.


24. Optimum Sample Size for Choosing the Largest of (k + 1) Parameters
from (k + 1) Otherwise Identically Distributed Populations. Paul N.
Somerville, University of North Carolina.


Assume we have (k + 1) populations, identically distributed except for unknown parameters
$a_0 \ge a_1 \ge \cdots \ge a_k$. Let it be required to take a preliminary sample of size (k + 1)n
with the object of deciding which population should be used for a further sample of size N.
Let $W(a_i, a_0)$ be the loss involved in choosing the population with parameter $a_i$ where
$W(a_i, a_0) \ge 0$, $W(a_0, a_0) = 0$. Let $C(n)$ be the cost of taking a preliminary sample. Then
it is shown that under certain conditions the maximum expected loss over all values of
$a_i$, $i = 0, 1, 2, \cdots, k$, occurs where $a_1 = a_2 = \cdots = a_k$. This enables us to find the maximum
expected loss, which can then be minimized with respect to the preliminary sample
size.


25. Necessary Conditions for the Existence of Partially Balanced Incomplete
Block Designs with Two Associate Classes. W. S. Connor and W. H.
Clatworthy, National Bureau of Standards.


For a partially balanced incomplete block design with two associate classes and with
parameters $v$, $b$, $r$, $k$, $n_1$, $n_2$, $\lambda_1$, $\lambda_2$, and $p^i_{jk}$ $(i, j, k = 1, 2)$, the following theorem has been
proved. If (i) $v > b$, then it is necessary that (a) $\Delta$ be a perfect square and (b) either $r - r_1 = 0$
or $r - r_2 = 0$; (ii) $v = b$ and $v$ is even, then it is necessary that (a) $\Delta$ be a perfect square
and (b) $r - r_u$ be a perfect square when $\alpha_u$ is odd $(u = 1, 2)$; (iii) $v = b$, $v$ is of the form
$4t + 3$ $(t = 0, 1, 2, \cdots)$, and $\Delta$ is not a perfect square, then it is necessary that $(r - r_1)(r - r_2)$
be a perfect square, and (iv) $v < b$ and $v$ is even, then it is necessary that $\Delta$ be a perfect
square, where $r_u = \tfrac12[(\lambda_1 - \lambda_2)(-\gamma + (-1)^u \sqrt{\Delta}) + (\lambda_1 + \lambda_2)]$ $(u = 1, 2)$, $\gamma = p^2_{12} - p^1_{12}$,
$\Delta = \gamma^2 + 2\beta + 1$, $\beta = p^1_{12} + p^2_{12}$, and $\alpha_1$ and $\alpha_2$ are nonnegative integers such that
$\alpha_1 + \alpha_2 = v - 1$. Examples are given of sets of parameters which fail to satisfy these conditions.


26. Estimation in Truncated Bivariate Normal Distributions. (Preliminary
Report.) A. C. Cohen, Jr., University of Georgia.


Maximum likelihood estimators of the parameters of a bivariate normal population are 
developed for samples which are subjected to a truncation on one of the variates at known 
terminals. Both single and double truncations with the number of missing (unmeasured) 
observations either known or unknown are considered. Asymptotic variances of the esti- 
mates are obtained from the likelihood information matrices. 


27. On a Class of Optimum Linear Predictors. R. F. Drenick and P. Nesbeda,
R. C. A. Victor Division, Camden, New Jersey.


Prediction is the problem of projecting into the future a set of observed data in order to 
obtain an estimate for future observable data. For optimum prediction one assigns, through 
some considerations which are not part of the method, a loss function representing the 
penalty for error. An optimum prediction procedure is the one which minimizes, in the long
run, this penalty. N. Wiener pointed out that the optimum mean square predictor is linear 
if the interference affecting the observations has Gaussian probability distribution. By 
using a method of estimation due to Pitman (‘‘Estimation of the location and scale param- 
eters of a continuous population of any given form,’’ Biometrika, Vol. 30 (1939), pp. 391- 
421) the authors show that the class of linear predictors is characterized by the Gaussian 
probability distribution and by a loss function more general than r.m.s., namely, one which 
is symmetric and has continuous derivatives. Most of the loss functions of practical interest 
are in this category. Furthermore any such loss function leads to the same linear predictor 
$X$, which has also the property: $P(|X - x| \le k) = \max$ for all $k > 0$, $x$ being the true
value. (Work sponsored by the Bureau of Aeronautics.)


28. Multiple Range Tests and the Multiple Comparisons Test. (Preliminary 
Report.) D. B. Duncan, Virginia Polytechnic Institute. 


Several methods are available for testing differences between treatments in an analysis 
of variance. The two considered most satisfactory are one by Newman (1952) and Keuls 
(1952) and the Multiple Comparisons Test by Duncan (1951). Both employ repeated ho- 
mogeneity tests. The Newman-Keuls test is simpler because it uses repeated range tests 
instead of F tests as used by the Multiple Comparisons Test. The latter is generally more 
sensitive owing partly to this reason but mostly to the relaxation of the significance levels 
of some of the tests considered to be of diminished importance. This paper presents: a 
new Multiple Range Test which achieves the simplicity of the Newman-Keuls test by using
range tests and most of the sensitivity of the Multiple Comparisons Test by using the special
significance levels, and an improved set of application rules for the Multiple Comparisons
Test. Each of these is recommended for use depending on the relative needs for simplicity
or sensitivity. The special system of significance levels is discussed in some detail. The
author is indebted to W. Beyer in the determination of significance ranges for the new test 
which is still in progress. (Research under contract No. DA-36-034-ORD-1084 (RD) with 
the Office of Ordnance Research, Department of the Army.) 


29. A Property of the Normal Distribution Related to a Theorem of S. Bern- 
stein. (Preliminary Report.) EvGene LuKacs anp EpcGar P. Kina, Na- 
tional Bureau of Standards. 


The following theorem is proved. Let $x_1, x_2, \dots, x_n$ be $n$ independently (but not neces-
sarily identically) distributed random variables and assume that the $n$th moment of each 
$x_i$ ($i = 1, 2, \dots, n$) exists. The necessary and sufficient conditions for the existence of two 
statistically independent linear forms $y_1 = \sum_{s=1}^{n} a_s x_s$ and $y_2 = \sum_{s=1}^{n} b_s x_s$ [$a_s \neq 0$; $b_s \neq 0$; 
$a_s/b_s \neq a_t/b_t$ for $s \neq t$; $s, t = 1, 2, \dots, n$] are that each random variable be normally dis-
tributed and that $\sum_{s=1}^{n} a_s b_s = 0$. For $n = 2$ this reduces to a theorem of S. Bernstein 
("Sur une propriété caractéristique de la loi de Gauss," Trans. Leningrad Polytechnic 
Institute, (1941), pp. 21-22). 


30. An Asymptotically Efficient Formula for Estimating Parameters from 
Grouped Data. (Preliminary Report.) M. C. K. Tweedie, Virginia 
Polytechnic Institute. 


Suppose that, in a sample from a discrete or grouped distribution, $z_i$ observations fall 
in group $i$, whose probability is $\pi_i(\theta_1, \dots, \theta_r)$, with $i = 1$ to $N$. The total sample size is 
$n = \sum_{i=1}^{N} z_i$, and may be constant or determined sequentially. Write $G = \sum_{i=1}^{N} z_i\, g(X_i)$, 
where $X_i = n\pi_i(\tau_1, \dots, \tau_r)/z_i$ and $g(X)$ is an arbitrary function of $X$ approximately 
quadratic near $X = 1$. An estimate of $(\theta_1, \dots, \theta_r)$ may be obtained (usually by differentia-
tion) as the set of values of $(\tau_1, \dots, \tau_r)$ for which $G$ is least (or greatest, depending on $g$). 
Under normal conditions of regularity, with large samples the consequent estimates are 
effectively consistent and have minimum variance, and are, in Neyman's terminology, 
BAN. (Cf. also H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 
1946, §30.3). This formulation includes some important methods precisely, such as maxi-
mum likelihood [$g(X) = \log X$] and minimum $\chi^2$ [$g(X) = (1 - X)^2/X$]. Thus, as Fisher 
(Statistical Methods for Research Workers, Chapter IX) has shown, the recombination frac-
tion can be estimated efficiently from $F_2$ genetical data by both these methods, and also by 
a product-ratio formula (given by $g(X) = X \log X$). In this problem an efficient linear es-
timation equation results from using $g(X) = (1 - X)^2$, equivalent to minimizing a modifica-
tion of $\chi^2$ which has the observed frequencies in the denominators. 
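
The criterion in this abstract can be tried out numerically. The following is a minimal sketch, not the author's procedure: it evaluates G as the sum over groups of z_i g(X_i), with X_i = n pi_i / z_i, and searches a single parameter theta for the value making G least (or greatest, for g = log). The four-class model with cell probabilities (2 + theta)/4, (1 - theta)/4, (1 - theta)/4, theta/4, the counts, and every function name are assumptions introduced here for illustration only.

```python
import math

# Hypothetical one-parameter, four-class model (an assumption for illustration).
def cell_probs(theta):
    return [(2 + theta) / 4, (1 - theta) / 4, (1 - theta) / 4, theta / 4]

def G(theta, counts, g):
    """G = sum_i z_i * g(X_i), where X_i = n * pi_i(theta) / z_i."""
    n = sum(counts)
    return sum(z * g(n * p / z) for z, p in zip(counts, cell_probs(theta)))

def golden_min(f, lo=1e-9, hi=1 - 1e-9, tol=1e-12):
    """Golden-section search for the minimizer of a unimodal f on [lo, hi]."""
    r = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c, d = b - r * (b - a), a + r * (b - a)
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
    return (a + b) / 2

def estimates(counts):
    # Each entry: (choice of g, whether G is to be made greatest rather than least).
    choices = {
        "max_likelihood": (math.log, True),
        "min_chi2": (lambda x: (1 - x) ** 2 / x, False),
        "modified_chi2": (lambda x: (1 - x) ** 2, False),
    }
    out = {}
    for name, (g, greatest) in choices.items():
        sign = -1.0 if greatest else 1.0
        out[name] = golden_min(lambda t: sign * G(t, counts, g))
    return out

counts = [1997, 906, 904, 32]  # illustrative counts only
print(estimates(counts))
```

Under these assumptions the three estimates should lie close together. Because the cell probabilities here are linear in theta, the criterion with g(X) = (1 - X)^2 is quadratic in theta, so its minimizer solves a linear equation, in line with the abstract's remark about a linear estimation equation.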





NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest. 


Personal Items 


Mr. J. C. Bain, formerly Chief Statistician of the Abitibi Power & Paper 
Company, Ltd., Toronto, Canada is now Director of Operational Research 
Projects for the Associated Merchandising Corporation, New York. In this new 
position he will direct work in Operational Research and its allied fields applied 
to the operations of department stores. 

Dr. Allan Birnbaum, formerly Statistician of the National Foundation for 
Infantile Paralysis, is now Lecturer and Director of the Statistical Consulting 
Service in the Department of Mathematical Statistics, Columbia University. 

Dr. Robert W. Burgess, formerly Economist and Actuary, Western Electric 
Company, was retired under the age rule in July, 1952, and was a consultant 
in Business and Economic Statistics for the next six months. In February, 1953 
he became Director, Bureau of the Census, Washington, D. C. 

Dr. Enrique Cansado, after completing his appointment as Visiting Professor 
at the University of California at Los Angeles (1951-1952), is now working in 
Santiago, Chile as Visiting Professor of General Statistics at the new Inter Amer- 
ican Training Center of Economic and Financial Statistics (Avenida Republica 
517). 

Dr. George B. Dantzig is now Research Mathematician with the Rand Cor- 
poration, Santa Monica, California. 

Dr. Herbert A. David, who recently received his Ph.D. degree from the 
University of London, has been appointed to the Section of Mathematical 
Statistics of the Commonwealth Scientific and Industrial Research Organiza- 
tion, Australia and will be stationed at the Sheep Biology Laboratory, Prospect, 
N.S.W., Australia. 

Mr. Gregory M. Dillon, who was recalled to duty in the Army two years ago 
as a captain, has returned from Army Field Forces Board #4, Fort Bliss, Texas 
to his position in the Treasurer’s Department, E. I. duPont de Nemours and 
Company, Wilmington, Delaware. 

Mr. Robert Fagot, who has recently been recalled to active duty as an Aerol- 
ogy Officer in the Navy for a period of 18-24 months, is now attached to Fleet 
Weather Central, Kodiak, Alaska. 

Professor Wayne W. Gutzman, who was recently discharged from the Navy, 
has resumed his duties as Professor of Mathematics at the University of South 
Dakota. 

Dr. Gordon M. Harrington, formerly Research Associate with the Educational 
Research Corporation, Cambridge, Massachusetts, is now with the Connecticut 
State Department of Education as Consultant-in-Research. 

Professor Paul G. Homeyer has resumed his duties at the Statistical Labora-
tory, Iowa State College, Ames, Iowa after serving as Statistical Lecturer and 
Consultant for F. A. O., United Nations, at the Agricultural Research Station, 
Rehovot, Israel from July 22 to September 26, 1952. 

Mr. William R. Hydeman has accepted an appointment as Staff Mathemati- 
cian with the Engineering Research Associates Division of Remington Rand 
Inc. This appointment is at the recently established ERA Computation Center 
at Arlington, Virginia. 

Professor Raymond J. Jessen has resumed his duties at the Statistical Labora-
tory, Iowa State College, Ames, Iowa after completing his assignment from 
July to October, 1952 to the Agricultural Statistics Training Center in Ecuador 
to plan and execute an agricultural survey on a sampling basis for demonstrating 
and teaching how modern survey techniques can be used in somewhat under- 
developed countries. 

Professor Dr. Hans Kellerer has recently been appointed Professor-in-Ordinary 
for Statistics at the Free University of Berlin, Germany. 

Dr. R. A. Leibler, formerly of Sandia Corporation, has accepted the position 
of Mathematician with the Department of Defense. 

Professor Kenneth May of Carleton College has been awarded a Ford Founda- 
tion Fellowship for the academic year 1953-54 for “the preparation of materials 
for the mathematical training of social scientists.” 

Mr. John W. Norton has been transferred from his production supervision 
duties at Union Oil Company’s Oleum Refinery and assumed duties in the Manu- 
facturing Economics Group in the Head Office of Union Oil Company, Los 
Angeles. 

Mr. James A. Pierce is now associated with the Engineering Department of 
Beech Aircraft Corporation in Wichita, Kansas. 

Mr. David A. Probst, formerly of the Institute of Statistics, University of 
North Carolina, is completing Ph.D. work in Petroleum Geology under Dr. 
W. C. Krumbein, Northwestern University. 

Dr. Samuel Weiss, Executive Director of the American Statistical Associa- 
tion, has been elected Secretary of the Allied Social Science Associations for 
1953. 




Educational Testing Service 


The Educational Testing Service is offering for 1954-55 its seventh series of 
research fellowships in psychometrics leading to the Ph.D. degree at Princeton 
University. Open to men who are acceptable to the Graduate School of the 
University, the two fellowships each carry a stipend of $2,500 a year and are 
normally renewable. 

Fellows will be engaged in part-time research in the general area of psycho- 
logical measurement at the offices of the Educational Testing Service and will, 
in addition, carry a normal program of studies in the Graduate School. Compe-
tence in mathematics and psychology is a prerequisite for obtaining these fellow-
ships. The closing date for completing applications is January 15, 1954. Informa-
tion and application blanks will be available about November 1st and may be 
obtained from: Director of Psychometric Fellowship Program, Educational Test-
ing Service, 20 Nassau Street, Princeton, New Jersey. 


International Congress of Mathematicians 


The next International Congress of Mathematicians will be held in Amster- 
dam, September 2-9, 1954. Among the seven sections into which the Congress 
will be divided is one on Probability and Statistics. The fee is expected to be 
about $14.00 for regular members. Those who wish to attend the Congress, or 
contribute a short lecture, should write at once to the Secretariat, 2d Boerhaaves-
traat 49, Amsterdam, The Netherlands, giving name and full address, degrees, 
qualifications, etc. 


Preliminary Actuarial Examinations Prize Awards 


The winners of the prize awards offered by the Society of Actuaries to the nine 
undergraduates ranking highest on the score of Part 2 of the 1953 Preliminary 
Actuarial Examination are as follows: 

First Prize of $200 

Broadwin, Emile Bernard ........ Harvard University 

Additional Prizes of $100 

Fredkin, Donald R. ............. New York University 
Gassner, Betty J. .............. [institution illegible] 
Hessenthaler, Ruth ............. Pembroke College 
Speake, Neal M. ................ University of Michigan 
Stemer, Liga ................... Swarthmore College 
Traynor, Edwin A. .............. Holy Cross College 
Watson, Charles B. H. .......... University of Toronto 
Zinger, Alexis ................. University of Montreal 

The Society of Actuaries has authorized a similar set of nine prizes for the 
1954 examinations on Part 2. 

The Preliminary Actuarial Examinations consist of the following three exami- 
nations: 

Part 1. Language Aptitude Examination. 

(Reading comprehension, meaning of words and word relationships, anto- 
nyms, and verbal reasoning.) 

Part 2. General Mathematics Examination. 

(Algebra, trigonometry, coordinate geometry, differential and integral cal- 
culus.) 

Part 3. Special Mathematics Examination. 

(Finite differences, probability and statistics.) 

The 1954 Preliminary Actuarial Examinations will be prepared by the Educa-
tional Testing Service and will be administered by the Society of Actuaries at 
centers throughout the United States and Canada on May 12, 1954. The closing 
date for applications is March 15, 1954. 
Detailed information concerning the Examinations can be obtained from: 
The Society of Actuaries 
208 South LaSalle Street 
Chicago 4, Illinois 




New Members 


The following persons have been elected to membership in the Institute 


February 19, 1953 to May 31, 1953 


Adams, Robert M., B.S. (John B. Stetson Univ.), Head of Data Analysis Section, Research 
Scientist (Mathematics), Radar Division, Defense Research Laboratory, University 
of Texas, Box I, University Station, Austin, Texas. 

Arnold, Harvey J., M.A. (Queen’s Univ.), Graduate Assistant, Institute of Statistics, 
State College, Raleigh, North Carolina, 404 Huron Street, Niagara Falls, Ontario, Can- 
ada 

Behrns, Vernon N., M.A. (Univ. of Buffalo), Mathematics Instructor, University of Buf- 
falo, 217 Kelvin Drive, Buffalo 23, New York. 

Berger, Arthur R., M.S. (Univ. of Illinois), Graduate Research Assistant, Department of 
Psychology, University of Illinois, 906 South Fourth Street, Champaign, Illinois. 

Binder, Arnold, B.S. (Merchant Marine Academy), Graduate Student, Department of 
Psychology, Stanford University, Stanford, California. 

Bradt, Russell N., M.A. (Univ. of Kansas), Graduate Student, Stanford University, Stan- 
ford, California, 634 Homer Avenue, Apartment 4, Palo Alto, California. 

Brakensiek, Donald L., M.S. (Univ. of Illinois), Graduate Student and Research Fellow 
in Agricultural Engineering, Iowa State College, Ames, Iowa, 128 Lynn Avenue, Ames, 
Iowa. 

Bryant, Edward C., M.S. (Univ. of Wyoming), Associate Professor of Statistics, University 
of Wyoming, Laramie, Wyoming, 1017 Gibbon Street, Laramie, Wyoming. 

Burkholder, Donald L., M.S. (Univ. of Wisconsin), University Fellow, University of Wis- 
consin, 1108. Brooks Street, Madison, Wisconsin 

Campopiano, Carmen N., M.S. (Rutgers Univ.), Project Engineer, Sperry Gyroscope Com-
pany, Great Neck, Long Island and Student, Department of Mathematical Statistics, 
Columbia University, 252-20 Shiloh Avenue, Bellerose 6, New York. 

Commins, William D. Jr., B.A. (Catholic Univ. of America), Graduate Student, Stanford 
University, Stanford, California, 1928 Varnum Street N.E., Washington 18, D.C. 

Curry, Nolan A., M.Ch.E. (Rensselaer Polytechnic Inst.), Assistant Quality Manager, 
Resin Bond Behr-Manning Corporation, Troy, New York, 17 Hawthorne Avenue, Troy, 
New York. 

de Cani, John S., M.B.A. (Univ. of Pennsylvania), Instructor, Department of Statistics, 
University of Pennsylvania, Philadelphia, Pennsylvania, 1118 South 46th Street, Phila- 
delphia 43, Pennsylvania 

DeGroot, Morris H., B.S. (Roosevelt College, Chicago), Graduate Student, University of 
Chicago, Chicago, Illinois, 6219 South Ellis Avenue, Chicago 37, Illinois 

Edwards, Gerald, M.A. (Columbia Univ.), Graduate Student, Columbia University, 
New York, 1205 Eastern Parkway, Brooklyn 13, New York. 

Folop, Albert A., B.A. (Univ. of Oklahoma), Lieutenant, U.S. Navy attending Princeton 
University as a graduate student, 32 Bank Street, Princeton, New Jersey 

Fox, Martin, A.B. (Univ. of Calif., Berkeley), Student & Computor, Statistical Laboratory, 
University of California, Berkeley, California, 2674 Sacramento Street, Berkeley 2, 
California. 

Gaylor, David W., M.S. (Iowa State College), Graduate Assistant, Statistical Laboratory, 
Iowa State College, Ames, Iowa, 3107 West Street, Ames, Iowa. 

Glazer, Harold, A.M. (Boston Univ.), Statistician, Harvard College Observatory, 10 
Vesta Road, Dorchester, Massachusetts. 

Gorman, Thomas A. Jr., B.S. (American Univ.), Mathematician, Statistical Engineering 
Laboratory, Bureau of Standards and Graduate Student at American University, 
3409-A New Mexico Avenue, N.W., Washington 16, D. C. 

Gran, Robert F., B.S. (Univ. of Washington), Graduate Student, University of Chicago, 
Chicago, Illinois, 2516 Central Street, Evanston, Illinois. 

Hellyer, Sydney, M.A. (Univ. of British Columbia), Graduate Student and Research As- 
sistant, Department of Psychology, Indiana University, Bloomington, Indiana. 

Hirnyck, William T., A.B. (Syracuse Univ.), Supervisor, Statistical Quality Control De- 
partment, Atlas Powder Company, Tamaqua, Pennsylvania, 547 Arlington Street, 
Tamaqua, Pennsylvania. 

Horne, Mary Alice, B.S. (Univ. of Georgia), Graduate Student, Research Assistant for 
Bureau of Business Research of University of Georgia, 106 Chestnut Street, Norton, 
Virginia. 

Hurst, David C., B.S. (Montana State College), Graduate Student, Experimental Statis-
tics, North Carolina State College, Raleigh, North Carolina, Institute of Statistics, 
North Carolina State College, Raleigh, North Carolina. 

Ives, William G. H., M.S. (Iowa State College), Agriculture Research Officer, Science 
Service, Division of Forest Biology, Canada Department of Agriculture and Graduate 
Student, Iowa State College, c/o Forest Biology Laboratory, Box 156, University of 
Manitoba, Winnipeg, Manitoba, Canada. 

Jaech, John L., M.S. (Univ. of Washington), Graduate Student and Teaching Fellow, 
University of Washington, Seattle, Washington. 5436—48th S.W., Seattle, Wash- 
ington. 

Jespersen, Howard W., M.S. (Univ. of Rochester), Graduate Assistant, c/o Statistical 
Laboratory, Iowa State College, Ames, Iowa. 

Johns, M. Vernon Jr., B.A. (Stanford Univ.), Graduate Student, Department of Mathe-
matical Statistics, Columbia University, New York, Room 623 Furnald Hall, Columbia 
University, New York 27, New York. 

Klose, Orval M., S.M. (Univ. of Chicago), Assistant Professor of Mathematics, Seattle 
University, Seattle, Washington (at present on leave), Graduate Student, 9960 Rainier 
Avenue, Seattle 8, Washington. 

Koopmans, Lambert H., A.B. (San Diego State College), Graduate Student, University of 
California, Berkeley, California, 8284 Golden Avenue, Lemon Grove 1, California. 

Kramer, Clyde Y., M.S. (Virginia Polytechnic Inst.), Graduate Student Instructor, Vir- 
ginia Polytechnic Institute, Blacksburg, Virginia, Statistical Laboratory, Virginia 
Polytechnic Institute, Blacksburg, Virginia. 

Lamphiear, Donald E., A.B. (The George Washington Univ.), Statistician, Office of the 
Assistant Director for Statistical Standards, Bureau of the Census, Washington 25, 
D. C. 

Lipstein, Benjamin, B.A. (Brooklyn College), Mathematical Statistician, Office of Statis- 
tical Standards, U.S. Bureau of Labor Statistics, Washington 25, D. C., 628 Northamp- 
ton Drive, Silver Spring, Maryland. 

Lloyd, Stuart P., Ph.D. (Univ. of Illinois), Research Mathematician in "Probability, 
Statistics, Discrete Systems", Bell Telephone Laboratories, Inc., Murray Hill, New 
Jersey. 

Lomeli, Maria Guadalupe, M.S. (Universidad Nacional Autonoma de Mexico), Graduate 
Assistant, Iowa State College, Ames, Iowa, Louisiana 93, Mexico 18, D. F., Mexico 

Lundegard, Robert J., M.S. (Purdue Univ.), Graduate Student, Purdue University, FPHA 
202-1, West Lafayette, Indiana. 

Lynch, James W., A.B. (Univ. of Georgia), Graduate Student and Research Assistant, 
University of Georgia, Athens, Georgia, Route 2, Athens, Georgia. 

Madison, Ralph L., M.S. (Iowa State College), Statistician, Aeronautical Radio, Inc., 
1520 New Hampshire, N.W., Washington, D. C., 703 Chetworth Place, Alexandria, Vir-
ginia. 

Magee, Col. Richard H., M.B.A. (Harvard Univ.), Director, Statistical Control and Mar- 
ket Research, Standard Register Company, Dayton 1, Ohio. 

May, Francis B., M.B.A. (Univ. of Texas), Assistant Professor of Business Statistics and 
Statistician, Office of the President, The University of Texas, 605 West 26th Street, 
Austin, Texas. 

Mela, Donald F., M.A. (Univ. of Michigan), Staff Member, Operations Evaluation Group, 
Navy Department, 4209 South Four Mile Run Drive, Arlington 4, Virginia. 

Moy, Shu-Teh Chen, Ph.D. (Univ. of Michigan), Emmy Noether Fellow in Mathematics 
of Bryn Mawr College, 287 Waverly Avenue, Detroit 3, Michigan. 

Navarro, Joseph A., M.S. (Purdue Univ.), Research Assistant, Purdue University, 509-4 
Airport Road, West Lafayette, Indiana. 

Newman, Peter K., M.Sc. (Univ. of London), Research Associate, Stanford and Graduate 
Student, Department of Statistics, Stanford University, Stanford, California. 

Newton, Douglas V., B.S. (Whitworth College), Graduate Student and Pre-Doctoral 
Associate, Department of Mathematics, University of Washington, Seattle 5, Washing- 
ton. 

Patell, Rusi K. N., B.S. (Sind Univ., Pakistan), Graduate Student, Mathematical Statis- 
tics, Columbia University, New York, 202 International House, 500 Riverside Drive, 
New York 27, New York. 

Paul, Gilbert I., M.S. (Univ. of Alberta), Graduate Student, Experimental Statistics, 
Institute of Statistics, Box 5457, State College Station, Raleigh, North Carolina. 

Perry, Frederick M., B.S. (Univ. of Maine), Research and Development Engineer (Noise 
and Information Theory), General Electric Advanced Electronics Development Center, 
Cornell University, Ithaca, New York, 2000 Pyle Road, Schenectady 3, New York. 

Pratt, John Winsor, A.B. (Princeton Univ.), Graduate Student, Department of Statistics, 
Stanford University, Stanford, California. 

Rabin, Jordan B., A.M. (Columbia Univ.), Statistician, RCA Service Company, Inc., 
2052 BN. John Russell Circle, Elkins Park 17, Pennsylvania. 

Read, Robert R., B.S. (Ohio State Univ.), Graduate Student, University of California, 
Berkeley, California, 2428 College Avenue, Berkeley 4, California. 

Richter, Donald L., A.B. (Bowdoin), Graduate Student, University of North Carolina, 
Raleigh, North Carolina, 1631 East 21st Street, Brooklyn, New York. 

Riordan, Frank S. Jr., Ph.D. (Univ. of Tennessee), Quality Control Supervisor, The Chem-
strand Corporation, c/o E. I. DuPont de Nemours and Company, Martinsville, Virginia. 

Rosenblatt, Judah I., B.A. (Johns Hopkins Univ.), Graduate Student, Columbia Uni- 
versity, New York, 1441 John Jay Hall, Columbia University, New York 27, New York. 

Roy, Anadi Ranjan, M.S. (Calcutta Univ.), Graduate Student, Department of Statistics, 
Stanford University, Stanford, California. 

Rumer, Evelyn L., M.A. (Univ. of Minnesota), Engineering Assistant, Statistical Methods 
Section, G. E. Laboratory, General Electric Company, 5 Union Street, Schenectady, 
New York. 

Rustagi, Jagdish S., M.A. (Delhi Univ., India), Graduate Student, Department of Statis- 
tics, Stanford University, Stanford, California. 

Saunders, Sam C., B.S. (Univ. of Oregon), Graduate Student and Teaching Fellow, De- 
partment of Mathematics, University of Washington, Seattle 5, Washington. 

Scheurer, Ernest M., B.A. (Reed College), Graduate Student and Teaching Fellow, De-
partment of Mathematics, University of Washington, Seattle 5, Washington. 

Sheehan, Franklin F., B.S. (Stanford Univ.), Graduate Student, Menlo College, Menlo 
Park, California. 

sistant, Statistical Laboratory, University of California, Berkeley, California, 2433 
Oregon Street, Berkeley 5, California. 

Stephens, Rothwell, Ph.D. (Iowa), Professor of Mathematics, Knox College, 3 Mt. Lucas 
Road, Princeton, New Jersey. 

St. Pierre, Jacques, M.A. (Université de Montreal, Canada), Graduate Student, Depart- 
ment of Statistics, University of North Carolina, Chapel Hill, North Carolina, 309 
Ransom Street, Chapel Hill, North Carolina. 

Straughan, James H., B.A. (Univ. of Florida), Graduate Student, Psychology Depart-
ment, Indiana University, Bloomington, Indiana. 

Thompson, William A. Jr., B.A. (Univ. of Illinois), Research Assistant and Graduate 
Student, University of North Carolina, Chapel Hill, North Carolina, 190 Jackson 
Circle, Chapel Hill, North Carolina. 

Tischendorf, John A., M.S. (Purdue Univ.), Graduate Research Assistant and Graduate 
Student, Purdue University, West Lafayette, Indiana, 210 Waldron Street, West La- 
fayette, Indiana. 

Truax, Donald R., M.S. (Univ. of Washington), Graduate Student, University of Washing-
ton, Seattle, Washington, 3309 Hover Place, Seattle 5, Washington. 

Tweedie, Maurice C. K., M.S. (Univ. of Reading, England), Associate Professor of Statis-
tics, Department of Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia. 

Wesler, Oscar, M.S. (New York Univ.), Graduate Student, Applied Mathematics and 
Statistics Laboratory, Stanford University, Stanford, California. 



REPORT OF THE WASHINGTON MEETING OF THE INSTITUTE 


The Graduate School, U. S. Department of Agriculture, Washington, D. C., 
was host to the Spring meeting on April 29, 30 and May 1, 1953. The meeting 
was held jointly with the Biometric Society, Eastern North American Region. 
The following 106 members of the Institute attended. 


P. H. Anderson, Allan Birnbaum, C. I. Bliss, R. C. Bose, R. A. Bradley, G. M. Brier, 
G. L. Burrows, J. M. Cameron, J. F. Canu, B. A. Clarke, W. H. Clatworthy, A. C. Cohen, 
Jr., S. E. Cohen, W. S. Connor, E. B. Cook, E. L. Cox, L. S. Crump, C. Daniel, D. A. Dar-
ling, Besse Day, F. R. Del Priore, W. E. Deming, Cyrus Derman, D. B. Duncan, David 
Durand, R. A. Eckler, Benjamin Epstein, L. J. Gerende, Dorothy Gilford, Leon Gilford, 
Mary Goins, H. S. Graf, S. W. Greenhouse, J. A. Greenwood, Evelyn Grossman, W. J. 
Hall, Max Halperin, J. F. Hannan, M. H. Hansen, H. H. Harman, Bernard Harris, Boyd 
Harshbarger, Robert Hooke, Jacob Horowitz, D. G. Horvitz, J. S. Hunter, C. M. Jaeger, 
A. T. James, N. L. Johnson, Wayne Jones, A. E. Karp, D. G. Kendall, W. A. Kimball, Jr., 
E. P. King, C. F. Kossack, Gunnar Kulldorff, Boyd Ladd, Jack Laderman, Gilbert Lieber-
man, J. E. Lieberman, Julius Lieblein, R. F. Link, Benjamin Lipstein, S. P. Lloyd, Eugene 
Lukacs, John Mandel, C. L. Marks, R. D. Marshall, R. H. Matthias, Paul Meier, D. F. 
Mela, H. A. Meyer, R. H. Morris, Milton Morrison, Paul Nesbeda, M. L. Norden, J. G. 
Osborne, K. C. S. Pillai, P. R. Rider, Joan Rosenblatt, S. N. Roy, Rose Sachs, Marion 
Sandomire, I. R. Savage, Henry Scheffé, M. A. Schneiderman, W. R. Simmons, Paul Somer-
ville, J. R. Stehn, M. E. Terry, L. J. Tick, J. W. Tukey, Maurice Tweedie, G. W. Tyler, 
D. L. Wallace, S. S. Wilks, Gerald Winston, M. A. Woodbury, W. J. Youden, W. B. Zacha-
rias, Marvin Zelen. 


The program of the meeting was as follows. 


WEDNESDAY, APRIL 29, 1953 


9:00 A. M. Registration 
10:30 A. M. Statistics in the Physical Sciences. 


Chairman: Leon Gilford, Bureau of the Census. 

Papers: (1) The Application of Multivariate Quality Control to a Photographic Problem. 
J. Edward Jackson (with Robert H. Morris), The Eastman Kodak Com-
pany, Rochester, N. Y. 

(2) Control and Measurement of Experiment Error. Joseph M. Cameron, Na-
tional Bureau of Standards. 

Discussants: Milton E. Terry, Bell Telephone Laboratories, Murray Hill, N. J. and 
Carl F. Kossack, Purdue University. 


1:30 P. M. Welcome on Behalf of the Graduate School, United States Depart- 
ment of Agriculture. 


Chairman: Glenn L. Burrows, Bureau of Agricultural Economics. 
Speaker: Philip V. Cardon, Ph.D., Director, Graduate School, U.S. D. A. 


1:45 P. M. Contributed Papers for the Institute of Mathematical Statistics. 


Part A 


Chairman: Glenn L. Burrows, Bureau of Agricultural Economics. 
Papers: (1) Optimum Sample Sizes for Choosing the Largest of (k + 1) Means using 
Minimax Methods. Paul N. Somerville, University of North Carolina. 
(2) The Correspondence between Two Classes of Balanced Incomplete Block De-
signs. W. S. Connor, National Bureau of Standards. 
(3) A Finite Frequency Theory of Probability. A. H. Copeland, Sr., University 
of Michigan. 
(4) Characterizations of Complete Classes of Tests of Some Multiparametric 
Hypotheses, with Applications to Likelihood Ratio Tests. Allan Birnbaum, 
Columbia University. 
(5) Confidence Regions for the Location of the Vertex in Quadratic Regression. 
(A Preliminary Report.) David L. Wallace, Princeton University. 
(6) The Noncentral Wishart Distribution. (A Preliminary Report.) A. T. 
James, Princeton University. 
(7) On Time-Dependent Waiting Line Processes. A. Bruce Clarke, University 
of Michigan. 


The following papers were presented by title. 


(8) Some Estimates which Minimize the Least Upper Bound of a Probability 
Together with the Cost of Observation. H. S. Konijn, University of Cali-
fornia, Berkeley. 

(9) On a Multivariate Analogue of Student’s t-Distribution, with Some Tables 
for the Bivariate Case. Charles W. Dunnett and Milton Sobel, Cornell 
University. 

(10) On the Completeness of Classes of Bayes’ Solutions. Lucien M. LeCam, 
University of California, Berkeley. 

(11) Identification and Estimation of Linear Structures with Symmetric Errors. 
T. A. Jeeves, University of California, Berkeley. 


Part B 


Chairman: David D. Mason, Bureau of Plant Industry, Soils, and Agricultural En-
gineering. 
Papers: (12) The Cramer-Smirnov Test in the Parametric Case. (A Preliminary Report.) 
Donald A. Darling, Columbia University. 

(13) Asymptotic Solutions of the Compound Decision Problem for Two Completely 
Specified Populations. James F. Hannan, (with Herbert Robbins), Catho-
lic University of America and University of North Carolina. 

(14) On the Estimation of the Mean Life of a Radioactive Source. (A Preliminary 
Report.) Richard F. Link, Princeton University. 

(15) The Use of the Questionnaire to Compare Two Populations for the Purpose 
of Improving the Course Content in a Mathematics Course for Business 
Teachers. Mary Goins, Marshall College. 

(16) Minimax Decisions Regarding Mean of a Normal Variable with Unknown 
Variance. Manindra N. Ghosh, University of North Carolina. 

(17) On Two-Stage Estimation Procedures. S. G. Ghurye, (with Herbert Rob-
bins), University of North Carolina and Institute for Advanced Study. 


The following papers were presented by title. 


(18) Estimation of the Location Parameter in the Structural Problem of Neyman. 
T. A. Jeeves, University of California, Berkeley. 
(19) On the Distribution of the Sum of the Roots of a Determinantal Equation. 
K. C. S. Pillai, University of North Carolina. 
(20) On a Problem in Multivariate Regression. Thomas S. Ferguson, Uni-
versity of California, Berkeley. 

(21) On the Problem of Construction of Orthogonal Arrays. Esther Seiden, 
University of Chicago. 
(22) The Joint Distribution of n Successive Amplitudes. (A Preliminary Re-
port.) W. C. Hoffman, U. S. Navy Electronics Laboratory, San Diego. 
(23) Simultaneous Tests for Regression Coefficients by the Two Stage Procedure. 
Manindra N. Ghosh, University of North Carolina. 
(24) Optimum Sample Size for Choosing the Largest of (k + 1) Parameters from 
(k + 1) Otherwise Identically Distributed Populations. Paul N. Somerville, 
University of North Carolina. 


THURSDAY, APRIL 30, 1953 


9:30 A. M. Statistics in the Biological Sciences. 


Chairman: Max Halperin, National Heart Institute. 
Papers: (1) Factorial Chi Square Analysis of Data from Experiments in Immunology. 
H. C. Batson, University of Illinois College of Medicine. 
(2) Applications of Nonparametric Methods to Medical Data. Irwin D. J. 
Bross, Cornell University Medical College. 
(3) Stochastic Growth and Mutation Processes. David G. Kendall, Magdalen 
College, Oxford and Princeton University. 
(4) Statistical Designs for the Efficient Removal of Trends Occurring in Com-
parative Experiments with Applications in Biological Assay. George E. 
P. Box, Imperial Chemical Industries, Manchester, England and North 
Carolina State College, (with W. A. Hay). 


1:30 P. M. Contributed Papers, Joint Session, for the Institute of Mathe- 
matical Statistics and the Biometric Society, Eastern North American Region. 


Chairman: Boyd Harshbarger, Virginia Polytechnic Institute. 
Papers: (1) Necessary Conditions for the Existence of Partially Balanced Incomplete 
Block Designs with Two Associate Classes. W. H. Clatworthy, (with W. 
S. Connor), National Bureau of Standards. 
(2) Estimation in Truncated Bivariate Normal Distributions. (A Preliminary 
Report.) A. C. Cohen, Jr., University of Georgia. 
(3) On a Class of Optimum Linear Predictors. R. F. Drenick, (with P. Nes-
beda), R. C. A. Victor Division, Camden, N. J. 
(4) Multiple Range Tests and the Multiple Comparisons Test. (A Preliminary 
Report.) David B. Duncan, Virginia Polytechnic Institute. 


3:00 P. M. Sequential Procedures. 


Chairman: Edward Paulson, Office of Naval Research. 
Papers: (1) Sequential Procedures in Component of Variance Problems. Norman Lloyd 
Johnson, University College, London and University of North Carolina. 
(2) Sequential Estimation. M. C. Kenneth Tweedie, Virginia Polytechnic 
Institute. 


Discussant: Walter Jacobs, Department of the Air Force. 


FRIDAY, MAY 1, 1953 


9:30 A. M. Contributed Papers for the Biometric Society, Eastern North Amer- 
ican Region. 


Chairman: Jerome Cornfield, National Cancer Institute. 
Papers: (1) The Analysis of Some Incomplete Block Designs with a Missing Block. 
Marvin Zelen, National Bureau of Standards. 
(2) The Fitting of Multi-Hit Survival Curves. Allyn W. Kimball, Oak Ridge 
National Laboratory. 
(3) Estimating the Dose of a Cardiac Glycoside for Human Subjects. C. I. Bliss, 
Connecticut Agricultural Experiment Station and Yale University, 
(with Theodore Greiner and Harry Gold, Cornell University Medical 
College). 
(4) On the Analysis of Variance of a Two-Way Classification with Unequal 
Sub-Class Numbers. Clyde Y. Kramer, Virginia Polytechnic Institute. 
(5) Ranking Versus Scoring in Palatability Tests Using Small, Trained Panels. 
Albert B. Parks, Bureau of Human Nutrition and Home Economics. 


11:15 A. M. Additional Contributed Papers for the Institute of Mathematical 
Statistics. 


(25) A Property of the Normal Distribution Related to a Theorem of S. Bernstein.
(A Preliminary Report.) Eugene Lukacs and Edgar P. King, National
Bureau of Standards. 

(26) An Asymptotically Efficient Formula for Estimating Parameters from 
Grouped Data. M. C. Kenneth Tweedie, Virginia Polytechnic Institute. 


1:30 P. M. Sampling. 


Chairman: Boyd Ladd, Bureau of the Budget. 
Papers: (1) Sampling for Time Series. Max A. Bershad (with William N. Hurwitz
and Ralph S. Woodruff), Bureau of the Census.
(2) Response Rates and Selectivity in Mail Surveys. Walter A. Hendricks, 
Bureau of Agricultural Economics. 
Discussant: Earl E. Houseman, Bureau of Agricultural Economics.


3:00 P. M. Simultaneous Confidence Interval Estimation and Testing of Hy- 
potheses. 


Chairman: William S. Connor, National Bureau of Standards.
Papers: (1) The General Theory and Application to Analysis of Variance and Covariance.
R. C. Bose, University of North Carolina.
(2) Applications to Multivariate Situations. S. N. Roy, University of North
Carolina. 
Discussants: David B. Duncan, Virginia Polytechnic Institute and Henry Scheffé, 
Columbia University. 
G. L. Burrows 
Assistant Secretary 





PUBLICATIONS RECEIVED 


Anuario Estadistico de Espana, (Instituto Nacional de Estadistica), Presidencia del
Gobierno, Madrid, 1952, xliii + 967 pp. 

Hood, W. C. and Koopmans, T. C., Studies in Econometric Methods, Cowles Commission
Monograph No. 14, John Wiley and Sons, Inc., New York, 1953, $5.50. 

Johnson, Arne I., Strength, Safety and Economical Dimensions of Structures, Bulletin of
the Division of Building Statics and Structural Engineering at the Royal Institute of
Technology, Stockholm, 1953, 159 pp.

Lacey, O. L., Statistical Methods in Experimentation: An Introduction, The Macmillan Co.,
New York, 1953, $4.50. 

Recenseamento Geral do Brasil (1° de Setembro de 1940) Censo Demografico and Censos
economicos, Servico Grafico de Instituto Brasileiro de Geografia e Estatistica, Rio de
Janeiro, 1950. (4 volumes in addition to those listed in March and June 1953.) 

Rudin, Walter, Principles of Mathematical Analysis, McGraw-Hill Book Co., New York,
1953, ix + 227 pp., $5.00. 

Wylie, C. R., Jr., Calculus, McGraw-Hill Book Co., New York, 1953, iii + 565 pp., $6.00.